

Open source development contains contributions from both hired and volunteer software developers. Identifying this employment status is important when we consider the transferability of research results to the closed source software industry, which includes no volunteer developers. While many studies have taken the employment status of developers into account, this information is often gathered manually due to the lack of accurate automatic methods. In this paper, we present an initial step towards predicting paid and unpaid open source development using machine learning and compare our results with the automatic techniques used in prior work. By relying on source code repository metadata from Mozilla and manually collected employment status, we built a dataset of the most active developers, both volunteers and those hired by Mozilla.
