For 24 hours, students used their programming skills to compete in the second annual Data Science Hackathon, otherwise known as “Datathon,” in the University Union this weekend.
Hosted by Binghamton University Data Science and Analytics (BUDSA), a campus organization that focuses on teaching and exploring a data-driven campus community, the Datathon allowed students to work in teams to apply their knowledge of programming to data science and analytics-based projects. Participants were also encouraged to come up with creative projects that aim to benefit the Binghamton community. According to Joshua Eimer, vice president of BUDSA and a junior double-majoring in computer science and mathematics, the event is a culmination of BUDSA’s objectives.
“The goal of BUDSA is to give students a safe area to learn skills, practice those skills and then network,” Eimer said. “This event ties together all three parts of our mission statement.”
According to Eimer, Datathon allows students in BUDSA, who have spent all year learning through tutorials how to use tools like Python, R and Excel, to apply their skills and practice building something. Companies donated datasets to the Datathon, and after students complete their projects, the companies will use them for recruitment.
“The companies that donate datasets will come and give a talk and will also send the projects that the students built, and this is used for recruiting material,” Eimer said.
Off Campus College Transport (OCCT), for example, donated time series data on when riders boarded and exited its buses, as well as how many people were left behind at each stop.
“This project is specifically geared towards Binghamton,” Eimer said. “The goal is to optimize their bus schedule so if there’s a specific time of day when more people get on the bus, they want to send out buses more frequently.”
Dhyanesh Thatchinamoorthy and Reshma Barvin Shahul Hameed Amanullah, first-year graduate students studying computer science, helped analyze the OCCT data and won first place overall in the competition. According to Amanullah, they specifically chose OCCT data since it directly impacts BU students.
“We selected OCCT data because we use this data every day boarding the bus,” Amanullah said. “Skipping just one bus is a big problem. We analyzed the data and found that from 10 to 11 [a.m.] on Tuesday, the buses are too crowded, so that’s why we created a new route so that there would be less problems and less people left behind.”
Yogesh Jagdale, a first-year graduate student studying computer science, was part of a group that won first place for the “Live in Bing” dataset category. Jagdale said their project specifically focused on the correlation between the prices of houses and other factors.
“We had datasets about the houses, and we had to find correlation between the prices of the houses and information on bedrooms, bathrooms, zip codes, floor space and parking,” Jagdale said. “We predicted the graphs based on the data given.”
Emad Alenany, a first-year graduate student studying industrial engineering, said he participated in Datathon to have his skills evaluated by others involved with data science and to enhance his résumé. While working with the dataset in the “Live in Bing” category, he said, his team tried various methods to approach the data.
“At the beginning, we thought of the problem as a classification problem instead of a regression problem,” Alenany said. “We converted the numerical variables into categorical variables to use one of the classification methods. Using data from ‘Live In Bing,’ our objective is to predict the price of apartments based on a number of 22 variables. Some of them are categorical and one is numeric. We try to study the problem by using some dimension reductionality methods.”
According to Eimer, the Datathon has expanded and improved since it began last year.
“Last year, this was a 12-hour event, and we only had two companies donate datasets,” Eimer said. “We also started with a zero-dollar budget and had to fundraise everything or spend out of pocket. This year, we had a budget from the Student Association, we had four companies that gave datasets and it’s a 24-hour event. As we have gained more popularity across Binghamton, we have more people coming that are serious about competing and are here for the benefits of the competition rather than the novelty.”