Research in Audio, Speech and Language Processing covers speaker recognition, speaker diarization, speech synthesis, voice conversion, speech separation, keyword spotting, speech recognition, speech enhancement, language identification, paralinguistic speech attribute recognition, and more. More than 100 papers have been published in top conferences and journals in this field. Ongoing collaborations with multiple industry leaders and local companies span collaborative research and technology transfer.
Multimodal Behavior Signal Analysis and Interpretation has been conducted for AI-assisted Autism Spectrum Disorder (ASD) diagnosis and intervention. An AI studio has been developed for the early screening of ASD. The studio's four walls are programmable projection screens that can recreate a variety of settings, such as a forest environment, with sound delivered through multichannel audio equipment. The therapist can use the studio to interact with the child, for example by asking the child to point at a certain object projected onto the wall and observing the reaction. At the same time, cameras capture the movements of the child and the therapist, including gestures, gazes and other actions. The studio is equipped with more than 10 technologies that have obtained or are in the process of obtaining patents. These include technologies for gaze detection, human pose estimation, face detection, face recognition, emotion recognition, speech recognition, speaker diarization, paralinguistic attribute detection and text understanding.
Hyperbolic neural networks have achieved considerable success in extracting representations from hierarchical or tree-like data, and have become an important tool for applications in natural language processing, recommendation systems and social network analysis. However, it is difficult to build hyperbolic neural networks with many deep hyperbolic layers, no matter which coordinate system is used. Moreover, hyperbolic neural networks often lack expressivity or miss important local geometric information. This project studies novel methods and neural network architectures for extracting expressive and robust hyperbolic representations in a stable manner. Applications include molecular generation, network analysis and graph anomaly detection.
Machine learning (ML) models and deep neural networks (DNNs) have achieved great success in many applications. However, recent research shows that DNNs are vulnerable to small perturbations of input data and can generalize poorly to new samples, making them less trustworthy in real-world scenarios. In this research, we aim to establish a systematic framework for building robust ML models that enjoy excellent generalization on future/unseen data. Both theoretical and practical explorations are conducted from the perspectives of adversarial training and data augmentation. So far, our research has been published in premier ML journals and conferences including ICML, WWW, ICCV, NeurIPS, and AAAI.
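For illustration, a toy adversarial-training loop in the spirit of this line of work, using one-step FGSM perturbations on a logistic-regression model (the model, step sizes and perturbation budget are simplified assumptions, not the published method):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    """One-step FGSM attack on logistic regression: move x in the
    direction that maximally increases the cross-entropy loss."""
    grad_x = (sigmoid(x @ w + b) - y) * w  # dLoss/dx
    return x + eps * np.sign(grad_x)

def adversarial_train(X, Y, eps=0.1, lr=0.5, epochs=200):
    """Min-max training: fit the model on worst-case perturbed inputs."""
    rng = np.random.default_rng(0)
    w, b = rng.normal(size=X.shape[1]) * 0.01, 0.0
    for _ in range(epochs):
        for x, y in zip(X, Y):
            x_adv = fgsm(x, y, w, b, eps)   # inner maximization
            err = sigmoid(x_adv @ w + b) - y
            w -= lr * err * x_adv           # outer minimization (SGD)
            b -= lr * err
    return w, b

# Toy linearly separable data: the label equals the first coordinate.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = np.array([0., 0., 1., 1.])
w, b = adversarial_train(X, Y)
preds = (X @ w + b > 0).astype(int)
```

The inner step (attack) and outer step (fit on the attacked input) form the min-max objective that adversarial training optimizes; deep models replace the closed-form gradient with backpropagation and multi-step attacks.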
Stroke is the leading cause of death in China. A dataset from Shanxi Province is analyzed to predict the risk level of patients across four states (low/medium/high/attack) and to estimate transition probabilities between states via a SHAP DeepExplainer. To handle the imbalanced sample set, we first proposed the quadratic interactive deep model (QIDeep), which flexibly selects and appends quadratic interactive features. Experimental results showed that the QIDeep model with 3 interactive features achieved a state-of-the-art accuracy of 83.33% (95% CI: 83.14%-83.52%). Blood pressure, physical inactivity, smoking, weight, and total cholesterol are the top five most important features. To achieve high recall in the attack state, stroke occurrence prediction is treated as an auxiliary objective in multi-objective learning. Prediction accuracy was improved, and the recall of the attack state increased from 71.49% to 82.06%, a 17.79% improvement over QIDeep with the same features. The prediction model and analysis tool provide not only a prediction method but also an attribution explanation of each patient's risk state and transition direction, a valuable tool for doctors in analyzing and diagnosing the disease.
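The idea of appending quadratic interactive features can be sketched as follows; the selection criterion used here (absolute correlation with the label) is a simplified stand-in for QIDeep's actual selection mechanism:

```python
import numpy as np
from itertools import combinations

def append_quadratic_features(X, y, k=3):
    """Score every pairwise product feature by |correlation with the
    label| and append the top k columns to the design matrix."""
    scored = []
    for i, j in combinations(range(X.shape[1]), 2):
        q = X[:, i] * X[:, j]
        if q.std() > 0:
            scored.append((abs(np.corrcoef(q, y)[0, 1]), i, j))
    scored.sort(reverse=True)
    extra = np.column_stack([X[:, i] * X[:, j] for _, i, j in scored[:k]])
    return np.hstack([X, extra])

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
y = (X[:, 0] * X[:, 1] > 0).astype(float)  # label driven by an interaction
X_aug = append_quadratic_features(X, y, k=3)  # now 5 + 3 = 8 columns
```

Because the label here depends on an interaction that no single raw feature captures, the appended product columns carry signal a purely linear feature set would miss, which is the motivation for interactive features in QIDeep.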
Medical imaging modalities such as MRI, CT and PET provide visualization and quantification of various disease and health properties of the human body. The quality of medical images directly influences the accuracy of disease diagnosis and treatment. One recent research project in Dr. Lei Zhang's lab develops and evaluates novel image processing methods to minimize MRI metal artifacts in the human brain. By combining three types of MRI images, the underlying biophysical properties of brain tissue (T1, T2, and PD) can be determined and then used to synthesize metal-artifact-resistant, high-contrast MRI images. Two aspects of brain image quality, metal artifact and image contrast, are thereby improved simultaneously. This image quality enhancement can enlarge the applicable population of brain MRI studies and improve the accuracy and precision of brain disease diagnosis and treatment.
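The synthesis step can be illustrated with the standard spin-echo signal equation; the toy tissue values and sequence parameters below are illustrative, and the lab's actual synthesis pipeline may differ:

```python
import numpy as np

def synthesize_spin_echo(pd, t1, t2, tr, te):
    """Spin-echo signal model: S = PD * (1 - exp(-TR/T1)) * exp(-TE/T2).
    Given voxel-wise PD/T1/T2 maps, any (TR, TE) contrast can be made."""
    return pd * (1.0 - np.exp(-tr / t1)) * np.exp(-te / t2)

# Toy 2x2 tissue maps: nominal white matter (left column) vs gray matter
# (right column); PD is relative, T1/T2 in milliseconds.
pd = np.array([[0.7, 0.8], [0.7, 0.8]])
t1 = np.array([[800., 1300.], [800., 1300.]])
t2 = np.array([[70., 100.], [70., 100.]])

t1w = synthesize_spin_echo(pd, t1, t2, tr=500., te=15.)    # T1-weighted
t2w = synthesize_spin_echo(pd, t1, t2, tr=4000., te=100.)  # T2-weighted
```

With a short TR/TE the short-T1 tissue (white matter) appears brighter, and with a long TR/TE the long-T2 tissue (gray matter) appears brighter, reproducing the familiar T1- and T2-weighted contrasts from a single set of quantitative maps.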
To ensure high quality and yield, today's advanced manufacturing systems are equipped with thousands of sensors that continuously collect measurement data for process monitoring, defect diagnosis and yield learning. In particular, the recent adoption of Industry 4.0 has promoted a set of enabling technologies for low-cost data sensing, processing and storage in manufacturing processes. While the manufacturing industry has created a large amount of data, statistical algorithms, methodologies and tools are urgently needed to process these complex, heterogeneous and high-dimensional data and to address the issues posed by process complexity, process variability and capacity constraints. The objective of this project is to explore the enormous opportunities for data analytics in the manufacturing domain and provide data-driven solutions for manufacturing cost reduction.
Reliability, Availability, and Serviceability (RAS) are core competencies of cloud services. Memory is one of the fastest-growing components in the von Neumann computer architecture and plays an important role: as the component that directly serves data to the central processor, a failure of the memory system can cause the processor to stop responding, or even crash the whole system. In this project, we proposed a new uncorrectable error (UCE) prediction algorithm based on correctable error (CE) spatial-temporal information, using machine learning methods. Evaluated on log data from 3,113 servers in a public cloud, its recall is 20% higher than current industrial and academic results at the same precision. In addition, by combining cluster analysis with physical failure mechanisms, we proposed strategies for physical fault detection and risk region localization. Finally, a DRAM fault simulator was built to study the RAS of DRAM.
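A minimal sketch of extracting CE spatial-temporal features from DRAM error logs; the feature set, thresholds, and the rule-based predictor below are illustrative stand-ins for the learned model:

```python
from collections import namedtuple

CE = namedtuple("CE", "ts rank bank row col")

def ce_features(events, window, now):
    """Spatial-temporal features from one DIMM's correctable-error log."""
    recent = [e for e in events if now - e.ts <= window]
    rows = {(e.rank, e.bank, e.row) for e in recent}
    cols = {(e.rank, e.bank, e.col) for e in recent}
    return {
        "ce_count": len(recent),        # temporal burstiness in the window
        "distinct_rows": len(rows),     # spatial spread across rows
        "distinct_cols": len(cols),     # spatial spread across columns
        "max_row_hits": max((sum(1 for e in recent
                                 if (e.rank, e.bank, e.row) == r)
                             for r in rows), default=0),
    }

def predict_uce(feat, count_th=10, row_hit_th=4):
    """Toy rule standing in for the classifier: flag DIMMs whose CEs burst
    in time or cluster on a single row (a row-level physical fault)."""
    return feat["ce_count"] >= count_th or feat["max_row_hits"] >= row_hit_th

# Five CEs repeatedly hitting row 100 of the same bank: a risky pattern.
events = [CE(ts=t, rank=0, bank=2, row=100, col=c)
          for t, c in [(1, 5), (2, 9), (3, 5), (4, 7), (5, 5)]]
feat = ce_features(events, window=10, now=5)
```

The intuition is that CEs concentrated on one row/column point to a physical fault region that is likely to escalate into a UCE, whereas scattered single-bit CEs usually do not.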
In recent years, data-driven intelligent transportation systems (ITS) have developed rapidly and brought various AI-assisted applications to improve traffic efficiency. However, these applications are constrained by their inherently high computing demand and the limitation of vehicular computing power. Vehicular edge computing (VEC) has shown great potential to support these applications by providing computing and storage capacity in close proximity. Given the heterogeneous nature of in-vehicle applications and the highly dynamic network topology of the Internet-of-Vehicles (IoV) environment, efficiently scheduling computational tasks is a critical problem. Accordingly, we design a two-layer distributed online task scheduling framework to maximize the task acceptance ratio (TAR) under various QoS requirements and unbalanced task distributions. Briefly, we implement computation offloading and transmission scheduling policies for the vehicles to optimize onboard computational task scheduling. Meanwhile, in the edge computing layer, a new distributed task dispatching policy is developed to maximize the utilization of system computing power and minimize the data transmission delay caused by vehicle motion. Through single-vehicle and multi-vehicle simulations, we evaluate the performance of our framework, and the experimental results show that our method outperforms the state-of-the-art algorithms. Moreover, we conduct ablation experiments to validate the effectiveness of our core algorithms.
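A toy sketch of deadline-aware task dispatching and the task acceptance ratio (TAR) metric; the greedy earliest-finish-time policy below is an illustrative simplification of the distributed policy described above, which additionally models vehicle motion and transmission delay:

```python
import heapq

def dispatch(tasks, n_servers):
    """Greedy earliest-finish-time dispatch to edge servers; a task is
    accepted only if it can finish before its deadline.
    tasks: list of (arrival_time, compute_time, deadline) tuples."""
    free = [(0.0, s) for s in range(n_servers)]  # (time server frees up, id)
    heapq.heapify(free)
    accepted = 0
    for arrival, work, deadline in sorted(tasks):
        free_at, sid = heapq.heappop(free)       # earliest-available server
        finish = max(arrival, free_at) + work
        if finish <= deadline:
            accepted += 1
            heapq.heappush(free, (finish, sid))  # server busy until finish
        else:
            heapq.heappush(free, (free_at, sid)) # reject; server stays free
    return accepted / len(tasks)                 # task acceptance ratio

tasks = [(0, 2, 5), (0, 2, 5), (1, 3, 4), (2, 1, 6)]
tar = dispatch(tasks, n_servers=2)
```

In this example the third task cannot meet its deadline on either busy server and is rejected, giving a TAR of 3/4; the framework above aims to push this ratio up under far more dynamic conditions.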
Vehicle fuel efficiency (VFE) plays a pivotal role in addressing the energy shortage caused by the increasing global demand for energy. Frequent stop-and-go movements and long waiting times at intersections significantly reduce VFE. These negative impacts are particularly severe when traffic flows are regulated by poorly designed traffic signal control. Existing works have successfully applied deep reinforcement learning (DRL) techniques to improve the efficiency of traffic signal control. However, to the best of our knowledge, few studies have explored traffic signal control for VFE through eco-driving techniques. To fill this gap, we propose a DRL-based fuel-economic traffic signal control method for improving vehicle fuel efficiency. Briefly, we adopt DRL to develop an agent that efficiently controls traffic signals based on real-time traffic information at intersections and adjusts the speed profiles of approaching vehicles to smooth traffic flows. We tested our method on both a synthetic traffic dataset and a real-world traffic dataset from surveillance cameras in Toronto. Through comprehensive experiments, we demonstrate that our method surpasses both pure eco-driving and pure traffic signal control techniques, significantly reducing vehicle fuel consumption and improving the efficiency of traffic signal control.
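A minimal illustration of RL-based signal control: tabular Q-learning on a toy two-queue intersection, with reward equal to the negative total queue length as a rough proxy for fuel burned while idling. The environment, reward, and algorithm are simplified assumptions; the project uses deep RL on realistic traffic:

```python
import random

def step(q, phase, action, rng):
    """One step of a toy intersection: two approach queues, one green
    phase. Each queue gains a vehicle w.p. 0.4; the green one serves 2."""
    if action == 1:
        phase = 1 - phase                   # switch the green direction
    q = [q[0] + (rng.random() < 0.4), q[1] + (rng.random() < 0.4)]
    q[phase] = max(0, q[phase] - 2)
    return q, phase, -(q[0] + q[1])         # reward: negative total queue

def key(q, phase):
    return (min(q[0], 9), min(q[1], 9), phase)  # capped discrete state

def train(episodes=200, steps=100, alpha=0.2, gamma=0.9, eps=0.1):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng, Q = random.Random(0), {}
    for _ in range(episodes):
        q, phase = [0, 0], 0
        for _ in range(steps):
            s = key(q, phase)
            qs = Q.setdefault(s, [0.0, 0.0])
            a = rng.randrange(2) if rng.random() < eps else qs.index(max(qs))
            q, phase, r = step(q, phase, a, rng)
            ns = Q.setdefault(key(q, phase), [0.0, 0.0])
            qs[a] += alpha * (r + gamma * max(ns) - qs[a])  # TD update
    return Q

def evaluate(Q, steps=300, seed=42):
    """Run the greedy learned policy; return cumulative reward."""
    rng = random.Random(seed)
    q, phase, total = [0, 0], 0, 0
    for _ in range(steps):
        qs = Q.get(key(q, phase), [0.0, 0.0])
        q, phase, r = step(q, phase, qs.index(max(qs)), rng)
        total += r
    return total

Q = train()
```

A never-switching fixed policy lets the red-direction queue grow without bound, so even this small learned agent accumulates far less waiting (and hence, by proxy, less idling fuel) than the naive baseline.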
In recent years, increasing concern about safe driving has stimulated the development of Intelligent Transportation Systems (ITS). One of the critical technologies for realizing ITS is sharing traffic information between vehicles in real time. Due to the inherently unstable network topology of the vehicular network environment, achieving reliable Vehicle-to-Everything (V2X) communication is a challenging research direction. To address this issue, we design a cloud-computing-assisted, traffic-aware data sharing protocol to improve inter-vehicle data transmission performance. Briefly, assuming a heterogeneous vehicular-cellular network environment, vehicles can utilize the cellular network's low latency and high reliability to request real-time public traffic information for a large area through the cloud. Based on the real-time public traffic information of the current area, i.e., the traffic density, the source vehicle can calculate a reliable data transmission route for delivering data packets to the destination vehicle. We further verify the effectiveness of the proposed method in packet delivery by conducting simulation experiments and comparing the results with other data routing protocols.
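The density-aware route computation can be sketched as a shortest-path search whose edge costs penalize sparsely occupied road segments, since packets on empty roads must be carried rather than forwarded hop by hop; the connectivity model and its saturation constant below are illustrative assumptions:

```python
import heapq

def best_route(graph, density, src, dst):
    """Dijkstra over road segments. Each segment's cost is its length
    divided by a connectivity score derived from vehicle density."""
    def weight(u, v, length):
        conn = min(1.0, density[(u, v)] / 20.0)  # assumed: saturates at 20 veh/km
        return length / max(conn, 0.05)          # floor avoids division by ~0
    dist, prev = {src: 0.0}, {}
    pq, seen = [(0.0, src)], set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in seen:
            continue
        seen.add(u)
        if u == dst:
            break
        for v, length in graph.get(u, []):
            nd = d + weight(u, v, length)
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    path, node = [], dst
    while node != src:                           # walk predecessors back
        path.append(node)
        node = prev[node]
    return [src] + path[::-1]

# A-B-D is geometrically shorter but nearly empty; A-C-D is dense.
graph = {"A": [("B", 1.0), ("C", 1.5)], "B": [("D", 1.0)], "C": [("D", 1.5)]}
density = {("A", "B"): 2, ("B", "D"): 2, ("A", "C"): 20, ("C", "D"): 20}
route = best_route(graph, density, "A", "D")
```

Here the protocol prefers the longer but well-populated detour A-C-D, which is the core trade-off a traffic-aware routing protocol makes.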
In this project, we collaborate closely with leading domestic enterprises. Using their sales and logistics data, we provide customers with guidance on pricing and discounts across all product categories. The project is combined with new retailing, applying data-driven methodology to all aspects from production to sales and providing advice on enterprise data management.
0512-36657577
No.8 Duke Avenue, Kunshan, Jiangsu, China, 215316