Keystroke Dynamics Ruby gem

The KSD gem 0.0.1 is out! This is my simple keystroke dynamics library for Ruby GTK widgets. Developers can help out on GitHub.

Here are some screenshots of the included examples.

Enroll with login
or
Enroll with sentences

Try to log in

If you do it right, you will see something like:

Verified user aczid with mean accuracy of: 0.585
Logged in successfully!

Update: Apparently somebody in China found this cool enough to blog about it! (translated)


Dynamic DataMapper objects from imported CSV data

I have been working on a project that required some CSV data to be imported into a database. After I noticed that DataMapper classes can be migrated through a class method, it occurred to me that I could dynamically create anonymous DataMapper classes for imports. In the code below the column types are known, but the column names are not: all the columns except the primary key are of type Float. You could extend this example to add magic for determining the type of data, if you need it. This is experimental code, so your mileage may vary.

class CsvImporter

  attr_accessor :table_class

  require 'fastercsv'

  def initialize(filename)
    # CSV filename
    @filename = filename
    # Table column names array
    @table_columns = ['primary_key']
    puts "Parsing CSV file #{filename}"
    parse_file(@filename)
  end 

  # Returns sanitized name from filename.
  # Replaces dashes with underscores, removes slashes, removes .csv extension and prepends 'csvimport_'
  def self.table_name(filename)
    basename = File.basename(filename.to_s).to_s
    table_name = "csvimport_#{basename.gsub(/\.csv/, '').gsub(/-/, '_').gsub(/\//, '')}"
    table_name
  end 

  # Import CSV data into the database table using the ORM class
  def parse_file(filename)
    @parsed_file = FasterCSV.read(filename)
    analyze_header(@parsed_file.shift)
    create_table(CsvImporter.table_name(filename), @table_columns)
    n = 0 
    @parsed_file.each do | row |
      hash = row2hash(row)
      unless @table_class.first(:primary_key => hash[:primary_key])
        instance = @table_class.new(hash)
        if instance.save
          n+=1
          GC.start if n%50 == 0
        end
      end
    end
  end

  # Converts a row of CSV data to a ruby Hash.
  def row2hash(row)
    hash = {}
    row.size.times do |i|
      unless row[i].nil?
        hash[ @table_columns[i].to_sym ] = row[i]
      end
    end
    hash
  end

  # Analyzes CSV header and adds fields to @table_columns array
  def analyze_header(header)
    header.each do | column |
      # strips digit prefixes from CSV header and adds the result to 
      # table columns
      if column
        #column = "token_#{column.to_s}" unless column.to_s[0].is_a?(Integer)
        @table_columns.push column.to_s.gsub(/^\d+: /,'')
      end
    end
  end

  # Automagically creates an ORM class for the import using @table_columns array
  def create_table(name, columns)
    # creates a new table class with a primary_key property
    @table_class = Class.new do
      include DataMapper::Resource
      property :id, DataMapper::Types::Serial
      property :updated_at, DateTime
      property :primary_key, String
    end

    # set table name
    @table_class.storage_names[:default] = name
    # shift first element off because it is the primary key
    pk = columns.shift
    columns.each do | column |
      # Here, I know all the columns except the primary key are of type Float. You can extend this to add magic for determining the type of data.
      @table_class.property column.to_sym, Float, :precision => 11
    end

    # unshift PK back in place
    columns.unshift(pk)
    # don't destroy tables we already have
    unless @table_class.storage_exists?
      @table_class.auto_migrate!
    end
  end
end
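
The type-guessing magic mentioned above could look something like the following minimal sketch. The helper name infer_column_type is an assumption for illustration and is not part of the importer. It only tells Integer, Float and String apart, based on the raw string cells of one column.

```ruby
# Hypothetical helper (not part of CsvImporter): guesses a Ruby class for a
# CSV column from its raw string cells.
def infer_column_type(cells)
  samples = cells.compact.map { |cell| cell.to_s.strip }.reject { |cell| cell.empty? }
  return String if samples.empty?
  return Integer if samples.all? { |cell| cell =~ /\A[-+]?\d+\z/ }
  # Float() raises on anything that is not a number, so rescue means "no"
  return Float if samples.all? { |cell| Float(cell) rescue false }
  String
end

puts infer_column_type(["1.5", "2", nil])  # prints Float
puts infer_column_type(["3", "4"])         # prints Integer
puts infer_column_type(["abc", "1.0"])     # prints String
```

The returned class could then be handed to @table_class.property in create_table in place of the hard-coded Float.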

The problem I had after this is that the anonymous class cannot be serialized in a traditional way. I decided to circumvent this by implementing a quick and dirty MySQL-specific DESC hack. I readily admit this is unstable, highly experimental code. If you plan to use it for any other purpose than mine, you will probably need to extend it a bit.

class CsvImporter
  def self.load_class(name)
    @table_class = Class.new do
      # Again, these types are known to always be there
      include DataMapper::Resource
      property :id, DataMapper::Types::Serial
      property :updated_at, DateTime
    end 
    @table_class.storage_names[:default] = name
    if @table_class.storage_exists?
      desc = repository(:default).adapter.query("desc #{name}")
      desc.each do |field|
        case field.type
        when /DateTime/i
          klass = DateTime
        when /Float/i
          klass = Float
        else
          klass = String
        end
        klass = DataMapper::Types::Serial if field.id == "id"
        if klass == Float
          @table_class.property field.id.to_sym, klass, :precision => 11
        else
          @table_class.property field.id.to_sym, klass
        end
        puts "Created field with id #{field.id.to_sym}, class: #{klass}"
      end 
    end
    @table_class
  end
end
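
To make the type mapping easier to try without a database, here is a minimal sketch of the same idea in isolation. DescRow and property_spec are hypothetical names, and the Struct is only a stand-in for whatever the adapter really returns for a DESC query.

```ruby
require 'date'

# Hypothetical stand-in for one row of DESC output; the real adapter returns
# its own result objects.
DescRow = Struct.new(:field, :type)

# Maps a DESC row to the arguments one might pass to a property declaration:
# [column name, Ruby class, options].
def property_spec(row)
  klass = case row.type
          when /datetime/i then DateTime
          when /float/i    then Float
          else String
          end
  options = klass == Float ? { :precision => 11 } : {}
  [row.field.to_sym, klass, options]
end

rows = [
  DescRow.new("updated_at", "datetime"),
  DescRow.new("score", "float"),
  DescRow.new("primary_key", "varchar(50)")
]
specs = rows.map { |row| property_spec(row) }
specs.each { |name, klass, options| puts "#{name}: #{klass} #{options.inspect}" }
```

Keeping the mapping in a small function like this makes it easier to add more column types later, should you need them.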

And there you have it: the ability to work with your CSV-imported data through a DM class, as if it had always lived in the database. I hope somebody besides myself finds this cool/useful.


Reversing 2-array axis in Ruby

Recently, I was working on a project that imports some CSV data into a dynamic database table. It needs to sort an array of floats. While coding this, I found myself doing something curious:

    rows = @table_class.all
    rows.each do | row |
      key = row.primary_key.to_sym
      @matches[key] = []
      row.instance_variables.each do | column |
        unless ['@id', '@repository','@primary_key','@original_values', '@new_record','@collection', '@updated_at'].include? column
          x = row.instance_variable_get(column)
          y = column.gsub(/@/, '') 
          @matches[key] << {:x => x, :y => y}
        end
      end 
      @matches[key] = @matches[key].sort_by { |match| match[:y] }
    end
    @matches

Sorting in Ruby! This smells bad. I put the data in a database for this?

The solution
The solution was to swap the axes of the imported data (turning columns into rows), thereby enabling MySQL to sort the data for us. Instead of doing:

    n=0
    @parsed_file.each do | row |
      hash = row2hash(row)
      unless @table_class.first(:primary_key => hash[:primary_key])
        instance = @table_class.new(hash)
        if instance.save
          n+=1
          GC.start if n%50 == 0
        end
      end
    end

We can parse the file with the axes swapped by doing:

    values = {}
    @parsed_file[0].enum_with_index.map do |primary_key, idx|
      if primary_key
        pk = primary_key.to_sym
        @parsed_file.collect do |row|
          if row[0]
            values[pk] = {} unless values[pk].is_a?(Hash)
            values[pk][row[0].to_sym] = row[idx]
          end
        end
      end 
    end
    n = 0 
    values.keys.each do |key|
      if values[key]
        unless @table_class.first(:primary_key => key.to_s)
          instance = @table_class.new(values[key])
          instance.primary_key = key
          if instance.save
            n+=1
            GC.start if n%50 == 0
          end
        end
      end
    end

I admit this is totally crazy code, and I don’t expect you to follow along. The rest of the class needed a bit of modifying too, but the first code example above has been simplified to:

    @matches[primary_key.to_sym] = @table_class.all(:order => [primary_key.to_sym.desc])

Of course, this hasn’t hurt performance either.
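
For anyone who wants to see the axis swap on its own, here is a minimal sketch using Array#transpose on a tiny in-memory table. The sample data and variable names are made up for illustration. It assumes every row has the same length, since transpose raises an IndexError otherwise.

```ruby
# Hypothetical parsed CSV: the first row is the header and the first column
# holds the row labels.
parsed = [
  [nil,    "col_a", "col_b"],
  ["row1", "1.0",   "2.0"],
  ["row2", "3.0",   "4.0"]
]

# After transposing, every former column is a row.
header, *rows = parsed.transpose
labels = header.drop(1).map { |label| label.to_sym }

# Build one hash per former column, keyed by the former row labels.
values = {}
rows.each do |row|
  key, *cells = row
  values[key.to_sym] = Hash[labels.zip(cells)]
end

puts values.inspect
```

Each entry in values now holds one former column, which is the shape that gets saved as a database row in the code above.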


Moving to Phusion Passenger

This week I have moved my Ruby websites (which were previously running on Mongrel) to the Phusion Passenger Apache2 module. I have lived without Apache for about a year, but I am really happy I switched back to it again. I am still using Nginx as a front-end proxy to serve static assets.
I am very pleased with Passenger because it makes deployment a lot easier! Basically, all Capistrano needs to do now for a deployment is move your app into the DocumentRoot and touch a "restart.txt" file. It supposedly works with any Rack-based web framework. I am using it with Merb and Rails.

I have more available memory and CPU cycles because there are no idle Mongrels running, and availability is increased because new instances of the apps are spawned as needed (where memory is shared between multiple instances of an app).

Life is good with Passenger!
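
The restart step is small enough to sketch in plain Ruby. This assumes the usual Passenger convention of a tmp/restart.txt file under the application root. The helper name request_restart is made up, and the demonstration runs against a throwaway directory rather than a real deployment.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: asks Passenger to reload the app on its next request
# by touching tmp/restart.txt under the application root.
def request_restart(app_root)
  restart_file = File.join(app_root, 'tmp', 'restart.txt')
  FileUtils.mkdir_p(File.dirname(restart_file))
  FileUtils.touch(restart_file)
  restart_file
end

# Demonstration against a temporary directory instead of a real app.
touched = Dir.mktmpdir { |app_root| File.exist?(request_restart(app_root)) }
puts "restart requested: #{touched}"
```

In a Capistrano recipe the same thing would normally be a one-line remote touch command in the restart task.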


Simple keystroke dynamics analyzer/validator written in Ruby-GTK

I have written a simple keystroke dynamics analyzer/validator as a school project. An instance of Analysis can be attached to a widget, and its collected keystroke data can be averaged and compared using class methods in Analysis.
The Validation class holds class methods to manage a password hashes file and save/load encrypted keystroke analysis metrics to/from disk.
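
To give a rough idea of what averaging and comparing keystroke data can mean, here is a minimal sketch. None of these method names come from the gem, and the scoring formula is an assumption chosen for simplicity, not the one the library uses. It also assumes every sample has the same number of keystrokes and no zero-length gaps.

```ruby
# Hypothetical helpers; none of these names come from the KSD gem.

# Turns key-press timestamps (in seconds) into the gaps between consecutive keys.
def intervals(timestamps)
  timestamps.each_cons(2).map { |earlier, later| later - earlier }
end

# Averages several interval lists position by position.
def mean_profile(samples)
  samples.transpose.map { |column| column.inject(:+) / column.size.to_f }
end

# Scores an attempt against a profile: 1.0 means identical timing, and each
# gap contributes at most a full miss.
def similarity(profile, attempt)
  deviations = profile.zip(attempt).map do |expected, actual|
    [(actual - expected).abs / expected, 1.0].min
  end
  1.0 - deviations.inject(:+) / deviations.size
end

enrolled = [intervals([0.0, 0.25, 0.5]), intervals([0.0, 0.25, 0.75])]
profile  = mean_profile(enrolled)
attempt  = intervals([0.0, 0.25, 1.0])

puts "profile: #{profile.inspect}"
puts "score:   #{similarity(profile, attempt)}"
```

A verifier would then accept the login when the score is above some threshold picked during enrollment.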

Full documentation is provided through RDoc annotations.

The code will be publicly browsable at my code site when I get permission from my teacher to host it. Update: this is now a gem, and the code is available on GitHub!


Testing C++ in Ruby, continued

For a while I’ve been working with SWIG to generate wrappers for my C++ code. I ran into some problems when using it in a real project, so I’ve made the following adjustments to my method. First of all I’m using mkmf to generate a Makefile for the shared objects. This is how the SWIG documentation shows it, and I have added some lines to link in more libraries and run the swig command. I’ve added the -minherit flag to support C++ inheritance.

require 'mkmf'
# Create wrapper module
`swig -c++ -ruby -minherit -Wall -o units_wrap.cpp units.i`

# Since the SWIG runtime support library for Ruby
# depends on the Ruby library, make sure it's in the list
# of libraries.
$libs = append_library($libs, Config::CONFIG['RUBY_INSTALL_NAME'])

# Also, we need the c++ libraries
$libs = append_library($libs, "stdc++")

# Create the makefile
create_makefile('units') 

This script, src/extconf.rb, is run as part of the build by adding the following to src/Makefile.am:

bin_PROGRAMS = example
example_SOURCES = example.cpp
noinst_unit_testsdir = .
noinst_unit_tests_DATA = units.so
EXTRA_DIST = extconf.rb autogen.sh
units.so:
     ruby extconf.rb
     make

I love this solution because now I only have 2 files in which to maintain compiler details instead of 3: src/units.i and src/Makefile.am. Also, by using the noinst_ prefix the module will not be included when I run ‘make dist’. You can see how it all fits together in my APR project.


Testing C++ in Ruby with Automake and RSpec

Since I’m a C++ newbie, I felt I needed more confidence in my code. As I’ve been programming in Ruby for almost 2 years now, I decided to look for a way to test my C++ code in Ruby. I found an excellent post by Dean Wampler that deals with just this topic. Please take the time to read this fascinating article before continuing. I happened to have been playing with autotools for a different C++ project last week, so I decided to use them for this one too. I decided not to put the SWIG stuff into the Makefile, but rather to use a shell script. Automake will make sure the modules get built the way they need to be built, and I can put everything described in Dean’s article in a little shell script which builds the wrapper object for Ruby. I’m using a file src/units.i which loads some module headers for SWIG to wrap. Update: the script below is now irrelevant, as I’ve found a way to generate this by using mkmf.

#!/bin/sh

CXX=/usr/bin/g++
SRC_DIR=src
CFLAGS="-fPIC -fno-strict-aliasing -g -O2"
BIN_DIR=bin
RUBY_LIBS=/usr/lib/ruby/1.8/i486-linux
SWIG=/usr/bin/swig

cd $SRC_DIR

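# Every source file except the generated SWIG wrapper
SRCS=`ls *.cpp | sed "s/.*_wrap\.cpp//"`

# Run the commands directly: wrapping them in backticks would try to execute their output
${CXX} -I. -I${RUBY_LIBS} ${CFLAGS} -c ${SRCS}
${SWIG} -c++ -ruby -Wall -o units_wrap.cpp units.i
${CXX} -I. -I${RUBY_LIBS} ${CFLAGS} -c units_wrap.cpp

# Collect the object files only after compiling, so a clean checkout links correctly
OBJS=`ls *.o`
${CXX} -shared -L. -rdynamic -Wl,-export-dynamic -L/usr/local/lib -o units.so ${OBJS} -lruby1.8 -lpthread -ldl -lcrypt -lm -lc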

You could extend it to take options for which module to build. As in my previous autotools example, I have a ‘src’ dir containing my C++ code, and a new ‘spec’ dir containing my RSpec specifications in Ruby. The spec_helper.rb in the spec dir is taken from Dean’s example. Here’s an example of a spec for a FileReader class:

require File.dirname(__FILE__) + '/spec_helper'
require 'units'

describe Units::FileReader do
  it "should be a constant on module Units" do
    Units.constants.should include('FileReader')
  end
end

describe Units::FileReader, ".new" do
  it "should create a new object of the type FileReader" do
    fr = Units::FileReader.new("file.txt")
    fr.filename.should == "file.txt"
  end
end

describe Units::FileReader, "#openFile" do
  it "should open a file" do
    fr = Units::FileReader.new
    fr.openFile("file.txt").should == true
    fr.filename.should == "file.txt"
  end
end

describe Units::FileReader, "#closeFile" do
  it "should close a file" do
    fr = Units::FileReader.new
    fr.closeFile.should == true
    fr.filename.should == ""
  end
end

describe Units::FileReader, "#readChar" do
  it "should read a char" do
    fr = Units::FileReader.new("file.txt")
    char = fr.readChar
    char.should == fr.ch
    char.is_a?(String).should == true
    char.length.should == 1
  end
end

describe Units::FileReader, "#readLine" do
  it "should read a line from the file" do
    fr = Units::FileReader.new("file.txt")
    line = fr.readLine
    line.is_a?(String).should == true
    line.match(/^.*\n$/).should_not be_nil
  end
end

One little caveat I ran into: you have to expose the private variables of your C++ class with public accessors to be able to use them. Dean’s example shows these two little methods that do this, but it wasn’t obvious to me as my brain has been cooked by programming in Ruby for too long. I hope this is helpful for anyone who wants to get started on testing their C++ code in Ruby easily.


I've bought a domain

I’ve finally bought my own domain, aczid.nl. Besides the blog, I want to host my music, code and other creations here.
All you readers out there can visit my blog at blog.aczid.nl.
I would like to thank Lasert for providing me with a 2 GHz/512 MB Xen slice on a big unclogged eweka tube. It’s a fun and stimulating experience to set up your own front-end/proxy server, subversion, et cetera.
I plan to do most if not all of the site in Ruby on Rails. The blog is served using Nginx, Mongrel and Typo.


Backgroundrb is pretty cool stuff

For my employer I'm developing a Rails app that needs to do a lot of RPC calls in the background. It's not really putting strain on our servers, but that sort of stuff will always take longer than you want it to. At first I just wanted PHP-style output buffering. However, that doesn't scale too well with Rails setups being the way they are.
To improve stability and reliability, we need to separate it from the Rails environment completely.

Enter Backgroundrb.

Backgroundrb lets you do some heavy lifting in the background. It's basically a separate server that allows you to run worker threads.
Now you might ask: "How does this fit into Rails?" Well, it goes a little something like this...

The worker (put this in /lib/workers)

class ExampleWorker < BackgrounDRb::Worker::RailsBase
  def do_work(args)
    results[:some_string] = "I'm robust!"
  end
end
ExampleWorker.register

Request for heavy lifting coming in:

def background_action
  session[:job_key] = MiddleMan.new_worker(:class => :example_worker, :args => {:arg => params[:some_arg]})
end

Response going out (this could be used in an AJAX callback action):

def callback
  @result = MiddleMan.worker(session[:job_key]).results[:some_string]
end

It's so simple I couldn't have thought it up!
You can name your worker jobs by passing :job_key in the :args hash.
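
If you want to play with the request/callback pattern without installing anything, here is a minimal in-process sketch. TinyMiddleMan is a made-up class and deliberately not Backgroundrb's API: it uses a plain Ruby thread instead of a separate server, so it only illustrates the idea of handing out a job key and collecting results later.

```ruby
require 'securerandom'

# Hypothetical in-process stand-in for a job broker; not BackgrounDRb's API.
class TinyMiddleMan
  def initialize
    @jobs = {}
    @lock = Mutex.new
  end

  # Starts the given block in a background thread and returns a job key.
  def new_worker(args = {}, &work)
    job_key = SecureRandom.hex(8)
    results = {}
    thread  = Thread.new { work.call(args, results) }
    @lock.synchronize { @jobs[job_key] = { :thread => thread, :results => results } }
    job_key
  end

  # Returns the results hash for a job, optionally waiting for it to finish.
  def results(job_key, wait = false)
    job = @lock.synchronize { @jobs.fetch(job_key) }
    job[:thread].join if wait
    job[:results]
  end
end

middle_man = TinyMiddleMan.new

# "Request coming in": kick off the work and remember the key.
job_key = middle_man.new_worker(:arg => 21) do |args, results|
  results[:some_string] = "I'm robust!"
  results[:doubled]     = args[:arg] * 2
end

# "Response going out": look the job up by its key.
final = middle_man.results(job_key, true)
puts final[:some_string]
```

The real thing runs outside the web process, which is the whole point: a slow or crashing job cannot take the Rails app down with it.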

Let me speculate that this will become a widely used plugin as Rails is receiving more and more commercial adoption. Who knows, it might even get merged into Rails core! It definitely blows PHP's output buffering out of the water!

Dependencies:
Slave, Daemons.


