
Rewrite of the industry standard for algorithms

“I’m not kidding – we are doing something that has never been done before,” Yassine Dhouib ’24 said of the research he, Dara Levy ’23 and visiting assistant professor of computer science Dave Perkins are leading this summer. The trio are working on two different computer science projects, both aimed at improving and streamlining industry-standard algorithms.

The first of these is the k-nearest neighbor (KNN) algorithm, commonly used in machine learning and data science. KNN lets a user make a prediction about an unlabeled data point by finding the k data points closest to it and using their known labels, a search that can take a long time on large data sets.
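For readers who want to see the idea in code, here is a minimal sketch of plain KNN classification in Python. It is a generic textbook version, not the team's code; the function name `knn_predict`, the Euclidean distance and the toy data are all assumptions made for illustration.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points.

    `train` is a non-empty list of (features, label) pairs.
    Euclidean distance is an assumption; other metrics are possible.
    """
    # Every training point is compared with the query. This full scan is
    # the part that becomes slow when the data set is large.
    by_distance = sorted(train, key=lambda pair: math.dist(pair[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data (hypothetical): two well-separated clusters.
train = [((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
         ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b")]

print(knn_predict(train, (1.1, 1.0), k=3))  # prints "a"
```

Because each prediction scans the whole training set, the cost grows with the amount of stored data, which is the inefficiency the research targets.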

Major: Computer Science

Birthplace: La Soukra, Tunisia

High school: Ariana Pioneer School

One application of KNN is in the healthcare industry, where it is used to sift through massive amounts of medical information and rely on certain characteristics to predict, for example, whether a patient has breast cancer or heart problems. Analyzing large amounts of potentially vital data is where KNN is most convenient, and also where improvements would be most useful.

Dhouib provided a basic explanation of how the team streamlined the KNN algorithm: “The way we did it is called AkNN, which stands for aggregate k-nearest neighbor. The way it works is if we used [for example] five neighbors to label a data point… we take these five closest neighbors and reduce them to this single data point. So as we go through the algorithm, the data we test keeps getting smaller and the execution time gets shorter.”
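One possible reading of Dhouib's description can be sketched as follows. The article does not say how the neighbors are reduced to a single point, so the choices here are assumptions: the merged point is the centroid of the neighbors and it carries their majority label. The name `aggregate_knn_step` is hypothetical and this is not the team's implementation.

```python
from collections import Counter
import math


def aggregate_knn_step(train, query, k=5):
    """Label `query`, then collapse its k nearest neighbors into one point.

    Returns (label, new_train). `train` is a non-empty list of
    (features, label) pairs.
    ASSUMPTION: the merged point is the centroid of the neighbors and
    keeps the majority label; the source does not specify this.
    """
    order = sorted(range(len(train)),
                   key=lambda i: math.dist(train[i][0], query))
    nearest = order[:k]
    label = Counter(train[i][1] for i in nearest).most_common(1)[0][0]

    # Average the neighbors coordinate by coordinate.
    dims = len(query)
    centroid = tuple(
        sum(train[i][0][d] for i in nearest) / len(nearest)
        for d in range(dims)
    )

    # Drop the k neighbors and keep only their aggregate, so the data
    # searched by later queries shrinks by up to k - 1 points.
    used = set(nearest)
    new_train = [pair for i, pair in enumerate(train) if i not in used]
    new_train.append((centroid, label))
    return label, new_train


# Toy data (hypothetical): two well-separated clusters.
train = [((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
         ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b")]

label, train = aggregate_knn_step(train, (1.1, 1.0), k=3)
print(label, len(train))  # prints "a 4"
```

In this sketch each query removes up to k points and adds one back, so later queries search less data. Whether accuracy holds up under that shrinking is the question the team had to check.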

Of course, Dhouib noted that he and Levy need to make sure that as the execution time decreases, the accuracy of the program does not decrease as well. Levy said that doesn’t appear to be the case and their tweaks seem to have made the program “faster and more efficient without losing too much precision.”

The second project is what they called the “pivot” project. Pivots are used to break large sets of data into more manageable chunks, using medians to predict where certain values would fall. Dhouib and Levy seek to create a “smarter” pivot, which will allow them to execute the algorithm more efficiently.
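The article does not describe the team's pivot, but the general idea of a median-based pivot can be illustrated with a common textbook technique, median-of-three selection. The sketch below is that standard technique, not the "smarter" pivot under development, and the function names are assumptions.

```python
def median_of_three(values):
    """Pick a pivot as the median of the first, middle and last values.

    This is a cheap estimate of the true median of a non-empty list.
    """
    first, middle, last = values[0], values[len(values) // 2], values[-1]
    return sorted((first, middle, last))[1]


def partition(values, pivot):
    """Split `values` into chunks smaller than, equal to and larger than `pivot`."""
    smaller = [v for v in values if v < pivot]
    equal = [v for v in values if v == pivot]
    larger = [v for v in values if v > pivot]
    return smaller, equal, larger


data = [9, 3, 7, 1, 8, 2, 5]
pivot = median_of_three(data)
print(pivot)                   # prints 5
print(partition(data, pivot))  # prints ([3, 1, 2], [5], [9, 7, 8])
```

A pivot close to the true median splits the data into chunks of similar size, which is what keeps divide-and-conquer algorithms such as quicksort efficient.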

At this point, they are nearing the end of their work. Dhouib and Levy have completed the KNN project, which has resulted in a paper they hope will be published in a scientific journal. Likewise, they plan to write and publish a paper on the results of the pivot project by the end of the summer.

Dhouib and Levy both expressed appreciation for the collaborative nature of the research. “If we’re stuck, we try to help each other,” Levy said. A few times a week, they met with Perkins to discuss their findings and plan their next steps. “It was really fun,” Dhouib remarked.