Since I had to use the R package smacof, yet the rest of my code usually is in python, I thought the fastest thing to do would be wrap smacof in python. Using the rpy2 lib, you can do this in quite a straightforward manner for any piece of R you’d like to use in python. The data exchange between numpy and R is also nicely taken care of.

Here is the standard US cities distances example in R/python/smackoff.

[code lang=”python”]

# numpy imports

from numpy import *

import pylab as plb

plb.ion()

from scipy.spatial.distance import pdist, squareform

# These imports take care of setting up the numpy->R conversion later

import rpy2.robjects as ro

from rpy2.robjects.numpy2ri import numpy2ri

ro.conversion.py2ri = numpy2ri

# import the wrapped smacof from R

from rpy2.robjects.packages import importr

rsmacof = importr(‘smacof’)

# the standard biggest 10 US cities pairwise distances dataset

# of length N*(N-1)/2

# modified from http://www.mathworks.com/help/toolbox/stats/briu08r-1.html

labels = [‘Atl’,’Chi’,’Den’,’Hou’,’LA’,’Mia’,’NYC’,’SF’,’Sea’,’WDC’]

dist = [587,1212,701,1936,604,748,2139,2182,543,920,940,1745,1188,713,1858,

1737,597,879,831,1726,1631,949,1021,1494,1374,968,1420,1645,1891,1220,

2339,2451,347,959,2300,1092,2594,2734,923,2571,2408,205,678,2442,2329]

if **name** == ‘**main**‘:

# convert the pairwise distances to distance matrix

d = squareform(dist)

# run smacof through rpy2 with the numpy.ndarray d, rpy2 takes care of

# numpy->R conversion because we set the correct flag earlier

stress = rsmacof.smacofSym(d, 2)

# turns out the embedding is returned in the index 3, a cast to array

# is all you have to do for the R->numpy conversion

data = array(stress[3])

data = -1*data # flip the axis to reflect the observer outside of earth (lolz)

# plot data + labels

plb.scatter(data[:,0],data[:,1])

for i,label in enumerate(labels):

plb.text(data[i,0]+.02,data[i,1],label)

plb.title(“US cities embedding by le Smackoff”)

[/code]

Edit: Seems like metric and nonmetric MDS are in queue for the next release of the machine learning library scikit-learn.

## Leave a Reply