Monthly Archives: June 2014

LevelDB vs. Berkeley DB

I had a requirement to index a good chunk of data (around 250M key-value pairs) and wanted to try out both Berkeley DB and LevelDB. Here are some metrics from a run on an Amazon m1.xlarge machine. Time to build the … Continue reading


How to create a set of indicator (boolean / one-hot) variables from a categorical (factor) variable in R

Here is an example of a categorical variable (a factor in R). data = cbind(data, model.matrix(~ 0 + user_state, data)) Here user_state is a variable containing 51 values (one for each state in the US). After the operation, we end up … Continue reading
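A minimal sketch of the same call on a toy data frame may make the result easier to picture. Only the column name user_state and the model.matrix / cbind call come from the post; the user_id column and the three state values are made up for illustration.

```r
# Toy data frame. The column name user_state is from the post;
# user_id and the specific state values are hypothetical.
data <- data.frame(
  user_id = 1:4,
  user_state = factor(c("CA", "NY", "CA", "TX"))
)

# "~ 0 + user_state" removes the intercept, so model.matrix() emits one
# 0/1 column per factor level instead of dropping a reference level.
indicators <- model.matrix(~ 0 + user_state, data)

# Append the indicator columns to the original data frame.
data <- cbind(data, indicators)

print(colnames(data))
```

With this toy input the new columns are named user_stateCA, user_stateNY and user_stateTX, and each row has exactly one of them set to 1. One caveat: under R's default na.action, model.matrix drops rows where user_state is NA, so the cbind would fail on a row-count mismatch if the factor has missing values.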
