How StockTwits Uses Machine Learning


A fascinating behind-the-scenes interview with StockTwits' Senior Data Scientist, Garrett Hoffman. He shares great tidbits on how StockTwits uses machine learning for sentiment analysis. I've summarized the highlights below:

  • Idea generation is a huge barrier to active trading
  • The next generation of traders uses social media to make decisions
  • Garrett solves data problems and builds features for the StockTwits platform
  • This includes: production data science, product analytics, and insights research
  • Understanding social dynamics makes for a better user experience
  • His focus is understanding the social dynamics of the StockTwits (ST) community, that is, what's happening inside it
  • ST's market sentiment model helps users with decision making
  • Users can 'tag' their content as bullish or bearish
  • Only 20 to 30% of content is tagged
  • Using ST's market sentiment model increases coverage to 100%
  • Data science work runs on a Python stack
  • Libraries used: NumPy, SciPy, pandas, scikit-learn
  • Jupyter Notebooks for research and prototyping
  • Flask for API deployment
  • For Deep Learning, uses TensorFlow on AWS EC2 instances
  • Can spin up GPUs as needed
  • Deep Learning methods used are Recurrent Neural Nets, Word2Vec, and Autoencoders
  • Stays abreast of new machine learning techniques from blogs, conferences and Twitter
  • Follows Twitter accounts from Google, Spotify, Apple, and small tech companies
  • One area ST wants to improve on is DevOps around Data Science
  • The goal is to bridge the gap between the research/prototype phase and embedding the result into the tech stack for deployment
  • Misconception that complex solutions are best
  • Complexity is ONLY OK if it leads to deeper insight
  • Simple solutions are best
  • Future long-term ideas: use AI around natural language
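
To make the coverage point concrete, here is a minimal sketch of how a model trained on the user-tagged slice of content can label the untagged remainder. It is not StockTwits' actual model: the interview mentions deep learning methods, and this sketch swaps in a tiny Naive Bayes classifier written with only the Python standard library to keep it short. The messages, function names, and labels are all hypothetical and invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical messages (not real StockTwits data).
# A label of None marks content the user left untagged.
MESSAGES = [
    ("great earnings, buying more", "bullish"),
    ("breakout coming, going long", "bullish"),
    ("strong guidance, buying the dip", "bullish"),
    ("weak earnings, selling now", "bearish"),
    ("breakdown coming, going short", "bearish"),
    ("weak guidance, selling the rip", "bearish"),
    ("buying more on strong earnings", None),
    ("selling, weak breakdown", None),
]


def tokenize(text):
    """Lowercase the text and split it into words, treating commas as spaces."""
    return text.lower().replace(",", " ").split()


def train(tagged):
    """Count messages per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in tagged:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest Naive Bayes log-probability."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids log(0) for unseen pairs.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


def fill_in_sentiment(messages):
    """Keep user tags where present; predict a label for everything else."""
    tagged = [(text, label) for text, label in messages if label is not None]
    model = train(tagged)
    return [
        (text, label if label is not None else predict(model, text))
        for text, label in messages
    ]


if __name__ == "__main__":
    for text, label in fill_in_sentiment(MESSAGES):
        print(f"{label:8s} {text}")
```

User-supplied tags are kept as-is and the model only fills in the gaps, so every message ends up with a label. A production system would also need held-out evaluation and far more training data than this toy set.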