Not everyone has the courage to take the road less travelled. But Professor Jiaya Jia, a renowned expert in computer vision and artificial intelligence (AI) at CUHK, has been delving into uncertainties since he was a student. “Two decades ago, AI wasn’t a decent term because its level of intelligence was low,” he says. His foresight and tenacity in exploring AI, however, have earned him an exceptional portfolio of achievements, including founding a tech unicorn and receiving the Best Innopreneur Award.
Riding the global tech wave of Industry 4.0, Professor Jia founded SmartMore Corporation Limited in 2019, specialising in smart manufacturing optimisation and automation (MOA) for efficiency and quality enhancement. With more than 20 years of research experience, he has led his team to create intelligent platforms with great scalability and broad applications. The company exemplifies the University’s research commercialisation efforts to grow the city’s innovation and technology (I&T) ecosystem.
Igniting the spark of intelligentisation
Next-generation “smart machines”, Professor Jia notes, are beginning to replace “brainless traditional machines” in factories across Germany, Japan, the US and mainland China. They operate in the dark 24/7 without a human onsite, with sensors that are equivalent to eyes, ears and noses. They even have an AI “brain” intelligent enough to monitor production lines, spot defects and make decisions. That is a major advance over traditional factories, where “unthinking” machines mindlessly repeat mechanical actions.
It took SmartMore only 18 months after its inception to become a unicorn, a start-up valued at over US$1 billion. The company has more than 1,000 employees worldwide, of whom over 70% have master’s or doctoral degrees. Both the proportion of research and development (R&D) staff and that of R&D investment exceed 60%. It has served more than 100 corporations from all over the world, including Airbus, Carl Zeiss and several Fortune Global 500 companies.
R&D on AI is very expensive as it involves advanced technologies. Tech giants like Google, OpenAI and Microsoft have invested billions of US dollars in efforts aimed at artificial general intelligence (AGI), such as ChatGPT. It took 75 years for telephones and 3.5 years for WhatsApp to reach 100 million users. ChatGPT reached that number in just two months.
“It costs a lot to invest in AGI as the system needs to be intelligent enough to process all aspects of knowledge,” says Professor Jia. “The reason for SmartMore’s fast scaling up is that we solve the pain points of manufacturers with specific AI solutions.”
In recognition of his passion for driving smart manufacturing and digital innovation, Professor Jia was conferred the first Best Innopreneur Award by the Federation of Hong Kong Industries (FHKI) in mid-February this year. He says, “I’m honoured to receive the award. SmartMore pledges to promote smart manufacturing and support Hong Kong’s reindustrialisation initiative, thereby creating more job opportunities and improving the I&T ecosystem.”
A life-changing decision
Professor Jia admits he had almost no idea what a computer was when he left his hometown to study at Fudan University more than 25 years ago. He says: “I realised we had entered a new era using computers. So, I chose computer science as my major.” Today, he is a tenured professor at CUHK’s Department of Computer Science and Engineering.
His papers have since been cited more than 50,000 times and he is a fellow of the US-based Institute of Electrical and Electronics Engineers (IEEE). He is the Associate Editor-in-Chief of IEEE Transactions on Pattern Analysis and Machine Intelligence, the flagship journal in AI research. He has also played key industry roles with Microsoft Research Asia, Adobe Systems and Tencent.
He could have opted for research directions like networks, database design and algorithms. “But I like to see beautiful pictures, photos, images – visual content that’s not just a bunch of data or number values but something I can perceive directly. I said to myself that if I want to see results visually, why not choose image processing or computer vision?”
From science fiction to reality
Computer vision is now a part of AI that simulates human perception so that a computer can, for example, distinguish between images of a cat and a dog. But AI was not sophisticated two decades ago, according to the professor. Computers had difficulty with even the simplest tasks like recognising letters or characters on printed paper.
Until 2012, algorithms connecting hundreds or thousands of computer “neurons” – like synapses in a human brain – mostly produced gibberish. But when Canada-based scientists began applying algorithms to millions of neurons, they broke new ground. Outputs became ordered and semantically meaningful. Computers could at last interpret an image and see the difference between a cat and a dog.
“After that, we were called AI researchers,” Professor Jia says with a chuckle, “because computer vision was not fiction anymore. We could sense it would become powerful, useful and applicable to a lot of areas.” Professor Jia and his PhD students began another decade of hard work. Part of their research focuses on how to join computer “neurons” and combine multi-modal information including natural language, images, videos and sound. One of his PhD graduates, Li Xu, is now the CEO of SenseTime Group Ltd. – one of the largest AI companies in the world.
AI research is a race against time. “Every algorithm only works for about half a year. It’s not just that computer engineering is changing fast. The intelligence level of AI is getting higher and higher,” Professor Jia observes. “We need to keep our eyes very closely on what’s hot in this area if we’re going to catch the train.”
He and his top students keep an eye on the trends in their field. “The research intensity is enormous, which means a lot of people are researching the same problem at the same time. What may once have been a 50-year research problem can now be solved in half a year – then my PhDs need to find another topic to study.”
Currently Professor Jia has up to 40 PhD students. He draws on his 20-plus years of experience to help them see the big picture. “I alert them to the most important topics in the computer engineering community, so that they don’t waste valuable time working on obscurities.”
A related initiative is the Deep Vision Lab, which Professor Jia set up informally to bring together computer engineering friends from other Hong Kong universities and top universities overseas to review key papers and the latest developments. They aim to identify the most important problems to be worked on over the next five to 10 years.
Still waters run deep
Professor Jia is taking on one of the toughest problems: how to bridge computer vision and natural language processing (NLP). “Originally, language and visual content processing were totally separate fields of research, but they are converging because the computer vision people are looking for NLP models to process visual data. There was also a time when NLP researchers used computer vision solutions.”
Success would make it possible to create a poster by talking naturally to a computer instead of typing in keywords. That is easier said than done. “The way we encode information and messages for vision and language is completely different,” says Professor Jia. “Joining these two is undoubtedly one of the most important tasks in my research pipeline. In the near future, we will intensify efforts to yield our own optimal solutions at system level, involving billions of training parameters.”
“I’m not afraid of difficulties, as they’re part of the innovation process. Smart technologies improve people’s lives. I hope that my example can motivate more young people to explore new areas and create social value.”
By Jenny Lau