
My Beef With ConcurrentMap

July 26, 2013

Many times I’ve needed to use a container such as a hash map to cache objects that are expensive to compute.

And often the cache must be thread-safe so that multiple threads may access it.

And frequently I’ve had a code reviewer ask “why didn’t you just use a concurrent map?” when they saw the synchronization scaffolding surrounding access to the map.

And always my answer is the same: the API of ConcurrentMap is flawed.

A common operation when caching is the lookup-create sequence: if the item is already cached, use it; otherwise create it and save it for future use. When it’s expensive to create the object, you usually want this whole operation to be atomic, to prevent multiple threads from creating the same object simultaneously. Using a map and locks, this is usually handled with something like:

  V lookupOrCreate(K key) {
     synchronized (myMap) {
       V val = myMap.get(key);
       if (val == null) {
         val = computeExpensivelyForKey(key);
         myMap.put(key, val);
       }
       // Return inside the block, where val is still in scope.
       return val;
     }
  }

The equivalent atomic lookup-create sequence with a ConcurrentMap uses its putIfAbsent function:

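  V lookupOrCreate(K key) {
     V newVal = computeExpensivelyForKey(key);
     // putIfAbsent returns the previous value, or null if there was none.
     V oldVal = myMap.putIfAbsent(key, newVal);
     return (oldVal != null) ? oldVal : newVal;
  }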

However, this doesn’t help much, because the caller must already have computed the expensive value before the lookup happens (and if computing it on every lookup is cheap enough, then caching may not be necessary anyway). Because of this, I’ve never found much use for the ConcurrentMap container, at least in a caching context, and almost always opt for the non-concurrent maps with my own locking.

The API would be much improved if it included an additional putIfAbsent function taking a factory that is called when needed:

  interface ValueFactory<K, V> {
     V create(K key);
  }

  V putIfAbsent(K key, ValueFactory<K, V> factory);

Now you could call computeExpensivelyForKey in the factory, which the concurrent map would invoke only if the key is absent.

I should point out that ConcurrentMap has a performance benefit because it allows concurrent access. So if you expect many threads to simultaneously access the map, it makes the better choice despite its lacking interface. If you want both concurrent access and atomic “lookup or create,” consider using the Cache from the Guava library. I recently discovered this library and was happy to see that it offers this interface. It also has advanced features such as expiration times, an LRU policy, etc. which are usually desired for caches anyway.

Troubleshooting Hadoop Daemons via HTTP Servlets

July 23, 2013

The Hadoop daemons expose useful data on HTTP “servlets” that are built into the code. Data exposed from these pages may help debug configuration problems, understand performance issues, etc.

This page is a reference because it’s nice to have these links handy in one place.

The addresses that the daemons use are:

  • JobTracker:    http://<JobTrackerHost>:50030
  • NameNode:    http://<NameNodeHost>:50070
  • TaskTracker:    http://<TaskTrackerHost>:50060
  • DataNode:    http://<DataNodeHost>:50075

The paths for these resources are:

  • /conf –   all the current configuration settings
  • /jmx –   JMX metrics
  • /metrics –   Hadoop metrics
  • /logLevel –   interface to change the log level
  • /stacks –   stack dump of running threads
  • /logs –   all the logs

Append the path to the address to get the data. For example, use http://my_jobtracker_host:50030/conf to get the configuration data from the JobTracker.
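To keep these links handy in a script, the address and path lists can be combined programmatically. This is only a sketch: the ports are the ones listed above (a cluster may be configured to use different ones), and the names `DAEMON_PORTS`, `SERVLET_PATHS`, `servlet_url` and `all_servlet_urls` are made up for this illustration.

```python
# Default HTTP ports as listed above; a cluster may be configured differently.
DAEMON_PORTS = {
    "JobTracker": 50030,
    "NameNode": 50070,
    "TaskTracker": 50060,
    "DataNode": 50075,
}

# Servlet paths as listed above.
SERVLET_PATHS = ["/conf", "/jmx", "/metrics", "/logLevel", "/stacks", "/logs"]


def servlet_url(daemon, host, path):
    """Build the URL for one servlet on one daemon (hypothetical helper)."""
    if path not in SERVLET_PATHS:
        raise ValueError("unknown servlet path: %s" % path)
    return "http://%s:%d%s" % (host, DAEMON_PORTS[daemon], path)


def all_servlet_urls(daemon, host):
    """Return the URL of every servlet for the given daemon and host."""
    return [servlet_url(daemon, host, path) for path in SERVLET_PATHS]


if __name__ == "__main__":
    for url in all_servlet_urls("JobTracker", "my_jobtracker_host"):
        print(url)
```

Run as a script, this prints one URL per servlet for the JobTracker, starting with the /conf URL from the example above.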

The Omnivore’s Hundred

June 18, 2013

I came across this list containing 100 different foods for omnivores to try. It originated several years ago from this blog, and the author invites you to copy this list to your own blog, and then highlight the items you’ve tried. Lots of people have done so; I guess I’ll jump on this bandwagon.

The instructions say:

  1. Copy this list into your blog or journal, including these instructions.
  2. Bold all the items you’ve eaten.
  3. Cross out any items that you would never consider eating.
  4. Optional extra: Post a comment here at www.verygoodtaste.co.uk linking to your results.

Judging from this list I’m a pretty adventurous eater, although there are a few things I’ve counted (some of the weirder ones) where I can really only claim to have taken a bite in order to say I tried it. But now the experience of trying it suddenly comes in handy, eh? Also, there are items I only have vague recollections of trying… I know I’ve tried them but can’t really pinpoint when or where.

The VGT Omnivore’s Hundred:

1. Venison
2. Nettle tea (I had nettle soup, I think that counts)
3. Huevos rancheros
4. Steak tartare
5. Crocodile
6. Black pudding
7. Cheese fondue
8. Carp (It’s likely I’ve had it but not sure)
9. Borscht
10. Baba ghanoush
11. Calamari
12. Pho
13. PB&J sandwich
14. Aloo gobi
15. Hot dog from a street cart
16. Epoisses (I’m not 100% sure on this one, but I think so)
17. Black truffle
18. Fruit wine made from something other than grapes
19. Steamed pork buns
20. Pistachio ice cream
21. Heirloom tomatoes
22. Fresh wild berries
23. Foie gras
24. Rice and beans
25. Brawn, or head cheese
26. Raw Scotch Bonnet pepper (Not a WHOLE RAW pepper, but you have to nibble before cooking to see how to use it)
27. Dulce de leche
28. Oysters
29. Baklava
30. Bagna cauda
31. Wasabi peas
32. Clam chowder in a sourdough bowl
33. Salted lassi
34. Sauerkraut
35. Root beer float
36. Cognac with a fat cigar (Probably not at the same time)
37. Clotted cream tea
38. Vodka jelly/Jell-O
39. Gumbo
40. Oxtail
41. Curried goat
42. Whole insects
43. Phaal (I want to try this, I’ve certainly had similar)
44. Goat’s milk
45. Malt whisky from a bottle worth £60/$120 or more
46. Fugu (I was debating whether to strike this or not… maybe I’d try it)
47. Chicken tikka masala
48. Eel
49. Krispy Kreme original glazed doughnut
50. Sea urchin (Don’t like it but I’ve tasted it)
51. Prickly pear
52. Umeboshi
53. Abalone
54. Paneer
55. McDonald’s Big Mac Meal
56. Spaetzle
57. Dirty gin martini
58. Beer above 8% ABV
59. Poutine
60. Carob chips
61. S’mores
62. Sweetbreads
63. Kaolin
64. Currywurst
65. Durian (Barely, one bite is all I could muster)
66. Frogs’ legs (I seem to recall trying this)
67. Beignets, churros, elephant ears or funnel cake
68. Haggis
69. Fried plantain
70. Chitterlings, or andouillette
71. Gazpacho
72. Caviar and blini (Well, not at the same time)
73. Louche absinthe (I had absinthe, maybe not Louche though, counting it anyway)
74. Gjetost, or brunost
75. Roadkill
76. Baijiu
77. Hostess Fruit Pie
78. Snail
79. Lapsang souchong
80. Bellini
81. Tom yum
82. Eggs Benedict
83. Pocky
84. Tasting menu at a three-Michelin-star restaurant. (Not yet to a three-star)
85. Kobe beef
86. Hare (Rabbit… close enough?)
87. Goulash
88. Flowers
89. Horse (Maybe unintentionally)
90. Criollo chocolate (Had upscale chocolate tasting so I’m counting this)
91. Spam
92. Soft shell crab
93. Rose harissa (I’ve had harissa, but unsure about the rose part)
94. Catfish
95. Mole poblano
96. Bagel and lox
97. Lobster Thermidor
98. Polenta
99. Jamaican Blue Mountain coffee
100. Snake

I was disappointed that some other things weren’t on here, so here are some bonus items:
101. Rocky Mountain Oysters
102. Chicken feet
103. Pig ear
104. Pig face
105. Duck tongue

Improved Testing JSONP Without a Server

May 6, 2013

The previous post presents a quick and dirty way to create a server for JSONP testing. While it works for the example provided, the assumption that the callback name is fixed creates shortcomings when used with many JavaScript libraries.

When using some libraries, an ajax handler will proxy your callback through its own dynamically generated callback, so you cannot know a priori what function name to return in the JSONP response. For example, when using the YUI JSONP library, the callback function requested in the URL query arguments looks like:
/?callback=YUI.Env.JSONP.yui_3_9_1_1_1367896794667_292.

The jQuery example from the last post suggested using:

function myCallback(data) {
    // Here I can handle the response
}
 
$.ajax({type: 'GET', url: 'http://localhost:50000', dataType: 'jsonp'});

Because the test JSONP server was equipped to only respond with a fixed callback name (here “myCallback”), it works by bypassing the jQuery response handler (and possibly missing any cleanup code that the ajax function needs to do). Usually a JSONP query is sent like so:

$.ajax({
    type: 'GET',
    url: 'http://localhost:50000',
    dataType: 'jsonp',
    success: function(data) {
        // Here I can handle the response
    }
});

In this case, the anonymous function that handles a successful response will be invoked by jQuery, and my server receives the request:
/?callback=jQuery18204170536657329649_1367898203473&_=1367898203496.

So instead of using netcat, an approach that works more universally is to make an HTTP server. The server may examine the query to determine the requested callback function, for example using a regex. With a scripting language such as Python, this is not difficult to do:

#!/usr/bin/env python3

import optparse
import re
import socket
import sys

def main():
    """Basic JSONP tester that serves a static JSON response."""
    parser = optparse.OptionParser("usage: %prog [options] file_to_serve")
    parser.add_option("-p", "--port", default=50000, type=int,
            help="Server port")
    parser.add_option("-d", "--debug", default=False, action="store_true",
            help="Show incoming headers")
    (options, args) = parser.parse_args()
    if len(args) != 1:
        parser.error("incorrect number of arguments")

    contents = ""
    with open(args[0], 'r') as json_file:
        contents = json_file.read().strip()

    # HTTP header lines end with CRLF; a blank line separates them from the body.
    response_template = ("HTTP/1.1 200 OK\r\n"
            "Content-Type: application/javascript\r\n\r\n{0}({1});")

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", options.port))
    sock.listen(1)
    while True:
        (client, _) = sock.accept()
        # Sockets carry bytes, so decode the request before searching it.
        data = client.recv(1024).decode("utf-8", "replace")
        if options.debug:
            print(data)
        # Echo back whatever callback name the library generated.
        match = re.search(r"callback=(\S+?)[\s&]", data)
        callback = match.group(1) if match else "callback"
        response = response_template.format(callback, contents)
        client.sendall(response.encode("utf-8"))
        client.close()
    sock.close()
    return 0

if __name__ == "__main__":
    sys.exit(main())

To launch the server, just invoke it with the name of a file that contains the static JSON response to return (unlike the example in the last article, it should consist of only the valid JSON response, and not contain any HTTP header or function name).
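To see how the regex copes with generated callback names, the extraction step can be tried on its own, without opening a socket. This is a minimal sketch, assuming Python 3. The helper names `extract_callback` and `jsonp_body` are made up for illustration, and the request lines are built from the two query strings quoted earlier in this post.

```python
import re

# Same pattern as the server script: capture everything after "callback="
# up to the next whitespace character or "&".
CALLBACK_RE = re.compile(r"callback=(\S+?)[\s&]")


def extract_callback(request_text, default="callback"):
    """Return the callback name requested in the query string, if any."""
    match = CALLBACK_RE.search(request_text)
    return match.group(1) if match else default


def jsonp_body(request_text, json_text):
    """Pad the static JSON with whatever callback the client asked for."""
    return "{0}({1});".format(extract_callback(request_text), json_text)


# Request lines as a browser would send them for the two libraries above.
jquery_request = ("GET /?callback=jQuery18204170536657329649_1367898203473"
                  "&_=1367898203496 HTTP/1.1\r\n")
yui_request = ("GET /?callback=YUI.Env.JSONP.yui_3_9_1_1_1367896794667_292"
               " HTTP/1.1\r\n")

if __name__ == "__main__":
    print(extract_callback(jquery_request))
    print(extract_callback(yui_request))
    print(jsonp_body(yui_request, '{"status": "ok"}'))
```

In the jQuery case the name ends at the "&" that precedes the cache-busting parameter. In the YUI case it ends at the space before "HTTP/1.1". A request with no callback parameter falls back to the default name.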

Testing JSONP Without a Server

April 27, 2013

(Note: See this article for a better solution.)

This is a trick for testing JavaScript code that needs to contact a web service for its data, when you don’t want to use the live service during testing (or when the web service is not available).

  1. Create the example response in a file. Start the file with an HTTP header that includes content type, followed by a blank line, so that the code processing the response doesn’t choke on the data:
    HTTP/1.1 200
    Content-Type:  application/javascript

    myCallback({ "my json example response" : "yah!" });

    Notice that the JSON string is padded by a callback function. This is the expected format for JSONP, but “myCallback” should be replaced with the name of the callback function the caller is expecting.

  2. Now that the response is stored in a file (called response.json), set up a server for it:
    $ while [ 1 ] ; do nc -l 50000 < response.json ; done

    This starts netcat listening on port 50000. When it gets a connection it will send the contents of the file. Putting it in a loop allows it to repeat for multiple requests.

  3. Try it. Call the web service using curl, and it should echo back the file, sans header (add -v to the curl command to see the header included).
    $ curl localhost:50000
    myCallback({ "my json example response" : "yah!" });
  4. Now use the test server in JavaScript code. For example, in jQuery:
    function myCallback(data) {
        // Here I can handle the response
    }
     
    $.ajax({type: 'GET', url: 'http://localhost:50000', dataType: 'jsonp'});

    Once testing is done, replace the URL with the actual service.
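
If the example payload changes often, the response file from step 1 can be generated rather than edited by hand. The following is a rough sketch, assuming Python 3. The function name `build_response` is made up for this illustration, and the layout it produces (header lines, a blank line, then the padded JSON) is the one netcat is expected to send back verbatim.

```python
import json
import sys


def build_response(callback, payload):
    """Return the text of the response file: header, blank line, padded JSON."""
    header = "HTTP/1.1 200\nContent-Type: application/javascript\n"
    # json.dumps guarantees the payload is valid JSON before it is padded.
    body = "%s(%s);\n" % (callback, json.dumps(payload))
    # The empty line tells the client where the header ends.
    return header + "\n" + body


if __name__ == "__main__":
    # Redirect the output into response.json to use it with the netcat loop.
    sys.stdout.write(
        build_response("myCallback", {"my json example response": "yah!"}))
```

The callback name passed in still has to match the function the page defines, just as with the hand-written file.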