Thomas Wiecki on Probabilistic Programming with PyMC3

A rolling regression with PyMC3: instead of the regression coefficients being constant over time (the points are daily stock prices of two stocks), this model assumes they follow a random walk and can thus slowly adapt over time to best fit the data.

Probabilistic programming is coming of age. While normal programming languages denote procedures, probabilistic programming languages denote models and perform inference on these models. Users write code to specify a model for their data, and the languages run sampling algorithms over probability distributions to output answers as full distributions, with explicit levels of confidence and uncertainty. These languages, in turn, open up a whole range of analytical possibilities that have historically been too hard to implement in commercial products.

One sector where probabilistic programming will likely have significant impact is financial services. Be it when predicting future market behavior or loan defaults, when analyzing individual credit patterns or anomalies that might indicate fraud, financial services organizations live and breathe risk. In that world, a tool that makes it easy and fast to predict future scenarios while quantifying uncertainty could have tremendous impact. That’s why Thomas Wiecki, Director of Data Science for the crowdsourced investment management firm Quantopian, is so excited about probabilistic programming and the new release of PyMC3 3.0.

We interviewed Dr. Wiecki to get his thoughts on why probabilistic programming is taking off now and why he thinks it’s important. Check out his blog, and keep reading for highlights!

A key benefit of probabilistic programming is that it makes it easier to construct and fit Bayesian inference models. You have a history of working with Bayesian methods in your doctoral work on cognition and psychiatry. How did you use them?

One of the main problems in psychiatry today is that disorders like depression or schizophrenia are diagnosed based purely on subjective reporting of symptoms, not biological traits you can measure. By way of comparison, imagine if a cardiologist were to prescribe heart medication based on answers you gave in a questionnaire! Even the categories used to diagnose depression aren’t that valid, as two patients may have completely different symptoms, caused by different underlying biological mechanisms, but both fall under the broad category “depressed.” My thesis tried to change that by identifying differences in cognitive function rather than reported symptoms to diagnose psychiatric diseases. Towards that goal, we used computational models of the brain, estimated in a Bayesian framework, to try to measure cognitive function. Once we had accurate measures of cognitive function, we used machine learning to train classifiers to predict whether individuals were suffering from certain psychiatric or neurological disorders. The ultimate goal was to replace disease categories based on subjective descriptions of symptoms with objectively measurable cognitive function. This new field of research is generally known as computational psychiatry, and is starting to take root in industries like pharmaceuticals to test the efficacy of new drugs.

What exactly was Bayesian about your approach?

We mainly used it to get accurate fits of our models to behavior. Bayesian methods are especially powerful when there is hierarchical structure in data. In computational psychiatry, individual subjects either belong to a healthy group or a group with psychiatric disease. In terms of cognitive function, individuals are likely to share similarities with other members of their group. Including these groupings into a hierarchical model gave more powerful and informed estimates about individual subjects so we could make better and more confident predictions with less data.
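
The pooling effect described above can be illustrated with a minimal sketch. This is not the model from the thesis work. It assumes a simple normal-normal hierarchy with known variances, and the function name `partial_pool`, the scores, and the variance values are all hypothetical, chosen only to show how a subject with few observations is pulled toward the mean of their group.

```python
from statistics import mean


def partial_pool(observations, group_mean, group_var, noise_var):
    """Posterior mean and variance of one subject's true score.

    Illustrative assumption: subject_score ~ Normal(group_mean, group_var)
    and each observation ~ Normal(subject_score, noise_var), with both
    variances treated as known.
    """
    n = len(observations)
    prior_precision = 1.0 / group_var   # information contributed by the group
    data_precision = n / noise_var      # information contributed by the subject's own data
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * group_mean
                            + data_precision * mean(observations))
    return post_mean, post_var


# Hypothetical numbers: a group averaging 50 and one subject with 3 noisy trials.
subject_trials = [62.0, 58.0, 66.0]
pooled_mean, pooled_var = partial_pool(subject_trials, group_mean=50.0,
                                       group_var=25.0, noise_var=100.0)
unpooled_mean = mean(subject_trials)    # estimate that ignores the group
unpooled_var = 100.0 / len(subject_trials)
```

With these made-up inputs the pooled estimate lands between the group mean and the subject's own average, and its variance is smaller than that of the unpooled average, which is the sense in which group structure yields more confident estimates from less data.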

Bayesian inference provides a robust means of testing hypotheses by estimating how much two groups differ from one another.

How did you go from computational psychiatry to data science at Quantopian?

I started working part-time at Quantopian during my PhD and just loved the process of building an actual product and solving really difficult applied problems. After I finished my PhD, it was an easy decision to come on full-time and lead the data science efforts there. Quantopian is a community of over 100,000 scientists, developers, students, and finance professionals interested in algorithmic trading. We provide all the tools and data necessary to build state-of-the-art trading algorithms. As a company, we try to identify the most promising algorithms and work with the authors to license them for our upcoming fund, which will launch later this year. The authors retain the IP of their strategy and get a share of the net profits.

What’s one challenging data science problem you face at Quantopian?

Identifying the best strategies is a really interesting data science problem because people often overfit their strategies to historical data. A lot of strategies thus often look great historically but falter when actually used to trade with real money. As such, we let strategies bake in the oven a bit and accumulate out-of-sample data that the author of the strategy did not have access to, simply because it hadn’t happened yet when the strategy was conceived. We want to wait long enough to gain confidence, but not so long that strategies lose their edge. Probabilistic programming allows us to track uncertainty over time, informing us when we’ve waited long enough to have confidence that the strategy is actually viable and what level of risk we take on when investing in it.
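
As a rough illustration of tracking uncertainty while out-of-sample data accumulates, the sketch below updates a belief about a strategy's mean daily return one day at a time. It is not Quantopian's method. It assumes normally distributed returns with a known daily volatility, a skeptical prior centered on zero, and simulated returns; the function names and every number are hypothetical.

```python
import math
import random


def update_mean_return(prior_mean, prior_var, daily_return, noise_var):
    """One conjugate normal update of the belief about the mean daily return."""
    post_var = 1.0 / (1.0 / prior_var + 1.0 / noise_var)
    post_mean = post_var * (prior_mean / prior_var + daily_return / noise_var)
    return post_mean, post_var


def prob_positive(mean, var):
    """P(true mean return > 0) under a Normal(mean, var) belief."""
    return 0.5 * (1.0 + math.erf(mean / math.sqrt(2.0 * var)))


random.seed(0)
noise_var = 0.01 ** 2                       # assumed known daily volatility of 1%
belief_mean, belief_var = 0.0, 0.005 ** 2   # skeptical prior: no edge expected

posterior_sds = []
for day in range(250):
    # Simulated out-of-sample return; the 0.1% daily edge is invented.
    r = random.gauss(0.001, 0.01)
    belief_mean, belief_var = update_mean_return(belief_mean, belief_var,
                                                 r, noise_var)
    posterior_sds.append(math.sqrt(belief_var))

# Probability that the strategy's true mean return is positive after 250 days.
confidence = prob_positive(belief_mean, belief_var)
```

The posterior standard deviation shrinks with every additional day of data, so a rule such as "allocate once `confidence` exceeds some threshold" gives one way to decide when the waiting period has been long enough.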

It’s tricky to understand probabilistic programming when you first encounter it. How would you define it?

Probabilistic programming allows you to flexibly construct and fit Bayesian models in computer code. These
