First graduate leaves DS&AI
Robin van Hoorn is the first graduate of the Data Science & Artificial Intelligence (DS&AI) program to receive his diploma. Cursor spoke to Van Hoorn in the run-up to the big day on July 10. In his thesis, the double Master’s student combined two models into a new one that uses generative AI to create fake patient data. It is a big step towards quicker, easier and safer research using patient data, with far less risk to the privacy of real patients.
The DS&AI Master’s program kicked off in 2021. Van Hoorn had had his sights set on a program focusing on data science and AI for some time, but his patience was tested when its start was delayed by COVID. To make the most of the waiting time, he started another Master’s (Innovation Management, ed.), intending to eventually follow the two programs side by side. “That was the plan all along anyway, two Master’s programs,” Van Hoorn says drily.
“As part of my Bachelor’s (Computer Science in Engineering, ed.) I took the Competitive Programming and Problem Solving honors track, in the context of which I participated in an AI challenge together with two fellow students. We didn’t know anything about the subject yet, but it was fun and went pretty well.” This is where his love for the subject was born, it would appear. “Later that year I also attended a TU/e seminar about deep learning by Albert van Breemen. He also runs a startup – VBTI – for which I subsequently did a bit of work to see if it suited me.” And it definitely did. Van Hoorn is happy with what he learned at DS&AI over the past two years. “For me, a real added value of the program was learning to read and implement papers that are published in the field of AI. It’s crazy how much is happening in this relatively new field. New and interesting papers on AI are published almost every week. Being able to read those papers and translate them to one’s own practice is super useful and benefited me greatly in my Master’s thesis.”
Generative AI
Van Hoorn defended his thesis on June 12, and it was awarded an impressive 9 out of 10 (cum laude). The title is striking as well: ‘Generating privacy-preserving longitudinal synthetic data’. Its focus is on generative AI, the technology that can create deepfakes (among other things). But generative AI works not only with photos and text, but also with numbers.
“In healthcare, a lot of privacy-sensitive data is processed. To conduct proper research in that sector or, for example, test applications, there’s a regular need for all kinds of patient data. The use of real data is preceded by a strict GDPR procedure, as this kind of data is sensitive and can only be used with permission. In addition, you constantly have to consider whether all data is really necessary. That’s a time-consuming process, which must be repeated for every application.”
For his thesis, Van Hoorn developed a model that can create patient data using generative AI. “The model utilizes general Generative Adversarial Network (GAN) technology, which means it only needs to be fed real data once to then create an endless stream of fake data. A big advantage is that privacy is much less of an issue at that stage, so researchers can carry out research or tests using patient data more easily, quickly and safely.” Proving this empirically is the research question of Van Hoorn’s Master’s thesis for Innovation Management. “In more concrete terms, I want to prove empirically that synthetic data has the potential to accelerate and improve the innovation development process within healthcare.”
Longitudinal data
Longitudinal data is data measured repeatedly over time for the same person, such as a patient’s blood pressure. Van Hoorn completed his graduation research at Philips and was able to use a data set from Catharina Hospital. “The data set I used was already being used in research by a PhD candidate. This made it easier to get clearance, but the process still took three months.” The fact that the underlying data had already been used in the PhD candidate’s research created an interesting opportunity for Van Hoorn to test his model. He used the generated data to replicate the PhD research. “And that worked! Up to a point. There’s definite room for improvement, but it was already better than what previous models were able to do. Models for generating longitudinal data and other models for privacy-preserving data already existed, but one in which both aspects were combined had yet to be created. So that’s what I did.”
All options open
One diploma with a good final grade in the bag (or almost), another one forthcoming. So that probably means the jobs are there for the taking? “There are many options, which is nice. I’d like to work for a large tech company, as that is where cool research and developments in the field of AI take place. But companies such as Google and Microsoft do have a hiring freeze at the moment.” Van Hoorn will first focus on rounding off his second Master’s, Innovation Management. He expects to do so in November 2023. “Maybe the big companies will be hiring again by then. If not, I can always go into consultancy and acquire a ton of experience.” Having said that, he would like to build things himself. “I established Team Hart three years ago. And I could always found a new startup.” For the moment, he’s leaving his options open.
Are you also interested in taking this Master’s program revolving around Data Science and Artificial Intelligence? A lot of university Bachelor’s programs with a technical basis provide admission to this program. If you’re not eligible for direct admission, you can take a DS&AI-specific pre-Master’s program. More information is available online.