forked from shilad/PyVowpal

Python wrapper for the Vowpal Wabbit machine learning library.

reidpr/PyVowpal

 
 

Overview

This library lets you stream training and test data into the Vowpal Wabbit machine learning library and then stream predictions out. These streams can be either pipes from Python (which the library translates to and from the VW text format appropriately) or files already in the VW format.
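As a rough illustration of what translating to and from the VW text format involves, here is a minimal sketch. The helper names `to_vw_line` and `parse_prediction` are hypothetical and are not part of this library; the library's own translation may cover more of the format (importance weights, tags, multiple namespaces) and may differ in detail.

```python
def to_vw_line(label, features, namespace=""):
    """Format one example as a single line of VW text input.

    `features` maps feature names to numeric values. Names are emitted in
    sorted order so the output is deterministic.
    """
    parts = []
    for name in sorted(features):
        # Space, ':' and '|' are separators in the VW format, so a feature
        # name containing any of them would corrupt the line.
        if any(ch in name for ch in " :|"):
            raise ValueError("invalid feature name: %r" % (name,))
        parts.append("%s:%s" % (name, features[name]))
    return "%s |%s %s" % (label, namespace, " ".join(parts))


def parse_prediction(line):
    """Return the numeric prediction from one line of VW prediction output.

    Assumes the prediction is the first whitespace-separated field; an
    optional tag may follow it.
    """
    return float(line.split()[0])
```

For instance, `to_vw_line(1, {"price": 0.23, "sqft": 0.25})` yields the line `1 | price:0.23 sqft:0.25`.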

At no time do you have to keep an entire dataset in memory in Python (though nothing stops you if you want to). However, you may need a lot of disk space to hold temporary files.

Authored by Shilad Sen and Reid Priedhorsky ([email protected]) and distributed under the Apache license, version 2.

Example

This glosses over important details, but the basic idea for streaming directly into and out of VW (i.e., no preformatted files) is this:

import vowpal2

training_stream = vowpal2.InputStream()
test_stream = vowpal2.InputStream()
pred_stream = vowpal2.OutputStream()

vw = vowpal2.Model(training_stream, test_stream, pred_stream)

for ex in my_training_examples:
   training_stream.put(ex)

for ex in my_test_examples:
   test_stream.put(ex)
   print "prediction for %s is %s" % (ex, pred_stream.get())

The last stanza could also be written as follows. If you are streaming both the test data and the predictions, this form is better because it avoids potential deadlocks when the pipes fill.

for ex in my_test_examples:
   print "prediction for %s is %s" % (ex, vw.predict(ex))
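To see why a one-in, one-out pattern sidesteps the deadlock, here is a self-contained sketch that does not use this library at all. It talks to a stand-in child process (a tiny Python script, purely hypothetical, in place of vw) and always reads one reply before writing the next line, so at most one line is ever in flight in each pipe. Whether `vw.predict` works this way internally is an assumption; the function name `predict_lockstep` is made up for the illustration.

```python
import subprocess
import sys

# Stand-in child process (hypothetical; a real setup would run vw instead).
# It answers every input line with exactly one output line and flushes it.
CHILD_CODE = (
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    sys.stdout.write('pred:' + line)\n"
    "    sys.stdout.flush()\n"
)


def predict_lockstep(examples):
    """Write one line, then read one line, so neither pipe can fill up.

    This holds as long as a single line is smaller than the OS pipe buffer.
    """
    child = subprocess.Popen(
        [sys.executable, "-u", "-c", CHILD_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    results = []
    try:
        for ex in examples:
            child.stdin.write(ex + "\n")
            child.stdin.flush()  # push the line to the child right away
            results.append(child.stdout.readline().rstrip("\n"))
    finally:
        child.stdin.close()   # end of input lets the child exit
        child.wait()
        child.stdout.close()
    return results
```

By contrast, writing every test example first and only then reading the predictions can stall: once the output pipe is full the child blocks on its write, stops reading its input, and the parent eventually blocks on its own write.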

Dependencies

  • Python 2.x, version >= 2.7
  • The vw executable (from the main VW website), version >= 6.1

Older versions may work but are untested.
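If you want to check these requirements from a script, something along the following lines may help. This is only a sketch: the function names are hypothetical, and it assumes that `vw --version` prints the version number at the start of its output, which you should verify against your installed vw.

```python
import re
import subprocess

try:
    from shutil import which  # Python 3.3 and later
except ImportError:
    from distutils.spawn import find_executable as which  # Python 2.7


def parse_version(text):
    """Return the leading dotted version number in `text` as a tuple of ints."""
    match = re.match(r"\s*(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError("no version number in %r" % (text,))
    return tuple(int(part) for part in match.group(1).split("."))


def python_supported(version_info):
    """True for Python 2.x with x >= 7, the range this README lists."""
    return version_info[0] == 2 and version_info[1] >= 7


def vw_supported(version_text, minimum=(6, 1)):
    """True if the version in `version_text` is at least `minimum`."""
    return parse_version(version_text) >= minimum


def installed_vw_version():
    """Return the version text reported by vw, or None if vw is not on PATH.

    Assumption: `vw --version` writes the version to standard output.
    """
    path = which("vw")
    if path is None:
        return None
    return subprocess.check_output([path, "--version"]).decode("ascii", "replace")
```

A caller might combine these as `python_supported(sys.version_info)` and `vw_supported(installed_vw_version())`, handling the `None` case when vw is not installed.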

More info

The code is well documented, so check the help strings for classes and methods.
