I am interested in the potential of AI to accelerate scientific investigations across disciplines through data-driven insights, predictive modeling, and intelligent automation. My current interests span both the application of AI/ML to real-world scientific problems and the development of novel AI methods that are robust, interpretable, and informed by physical principles.
I am especially excited about multi-modal and physics-informed approaches that integrate diverse sources of information—such as experimental data, simulations, and domain knowledge—into unified learning frameworks. These hybrid methods hold great promise for advancing our understanding of complex systems in areas like materials science, chemical imaging, and quantum dynamics.
Beyond applications, I am equally interested in exploring fundamental challenges in AI, including learning under uncertainty, generalization from limited data, and interpretable decision-making. My long-term goal is to develop AI systems that not only perform well in controlled settings but also translate effectively into real-world scientific and engineering contexts.
Below are some of my works in these directions:
https://arxiv.org/abs/2509.10753
Large Language Models can reason, write, and problem-solve at remarkable levels—but they also hallucinate, generating plausible-sounding content that's factually wrong. This limits their use in healthcare, legal analysis, and scientific research where accuracy is non-negotiable.
HalluField treats LLM responses like thermodynamic systems. By analyzing the "energy landscape" of token generation, we detect hallucinations as unstable states—the same way a physicist identifies instability in physical systems.
1. Token Paths as Energy States
Each possible response sequence is modeled as a trajectory with associated energy—high-probability paths have low energy (stable), low-probability paths have high energy (unstable).
2. Temperature Perturbation
We probe the model by varying the temperature parameter. Reliable answers maintain stable energy distributions; hallucinations show erratic, sensitive behavior.
3. Variational Analysis
Grounded in the first law of thermodynamics, we quantify stability through energy-entropy variations—no external fact-checking required.
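The three steps above can be sketched in a few lines. This is a toy illustration, not the actual HalluField implementation: the function names, the choice of temperatures, and the use of energy standard deviation as the instability signal are all my own simplifications of the idea that stable answers keep a calm energy profile under temperature perturbation.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over the vocabulary axis."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    p = np.exp(z)
    return p / p.sum(axis=-1, keepdims=True)

def path_energy(logits, token_ids, T=1.0):
    """Negative log-likelihood of the generated path: low energy = stable."""
    probs = softmax(logits, T)
    picked = probs[np.arange(len(token_ids)), token_ids]
    return -np.log(picked + 1e-12).sum()

def mean_entropy(logits, T=1.0):
    """Average Shannon entropy of the next-token distributions."""
    p = softmax(logits, T)
    return -(p * np.log(p + 1e-12)).sum(axis=-1).mean()

def instability_score(logits, token_ids, temps=(0.5, 1.0, 1.5, 2.0)):
    """Spread of the path energy under temperature perturbation.
    Erratic energy across temperatures flags a likely hallucination."""
    energies = [path_energy(logits, token_ids, T) for T in temps]
    return float(np.std(energies))
```

Because everything is computed from the model's own logits, the whole analysis is post hoc: no fine-tuning, no auxiliary model, no retrieval.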
No fine-tuning — Works directly on any LLM's output logits
No auxiliary models — No additional neural networks needed
No external knowledge bases — Self-contained detection
Real-time capable — Minimal computational overhead
Framework: Field-theoretic + variational principle
Input: Model logits (temperature-perturbed)
Output: Hallucination probability score
Variants: HalluField (full), HalluFieldSE (entropy-based)
Dependencies: None (post-hoc analysis)
HalluField opens new research directions at the intersection of physics and trustworthy AI—demonstrating that principles from thermodynamics can solve fundamental reliability challenges in machine learning.
Optics Communications, Volume 507, 15 March 2022, 127691.
ChemSpecNet is a deep learning framework I developed to bring modern computer vision and machine learning techniques into the field of chemical imaging. It was designed to address a core limitation of Sum Frequency Generation (SFG) spectroscopic imaging: the need for spatial averaging (pixel binning) to overcome low signal-to-noise ratios, which traditionally comes at the cost of spatial resolution.
SFG imaging is a uniquely powerful method for probing surface chemistry, but its weak signals often demand long acquisition times or heavy post-processing. Conventional methods like spectral curve fitting break down in noisy environments and are computationally expensive. ChemSpecNet tackles this challenge by reimagining the problem as a spectral classification task. It uses a supervised neural network to directly identify chemical signatures from noisy pixel-level spectra, enabling high-resolution imaging without compromising detail or speed.
Trained on over a million spectra from Self-Assembled Monolayers (SAMs) on gold substrates, ChemSpecNet achieves:
92% classification accuracy at the single-pixel level (no binning)
Up to 99.5% accuracy using minimal 8×8 binning
Robust generalization across experimental variations
Full-resolution, real-time chemical mapping without the need for long acquisition times
Technically, ChemSpecNet is built in TensorFlow as a fully connected neural network for hyperspectral imaging, with:
Input: mid-IR SFG spectra with 71 wavenumber channels per pixel
Output: per-pixel chemical identities and chemical maps for the full image
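The per-pixel classification workflow can be sketched as follows. The published network is a TensorFlow model; this numpy stand-in uses hypothetical layer sizes and untrained random weights purely to show the shape of the computation, from 71-channel spectra in to a chemical map out.

```python
import numpy as np

rng = np.random.default_rng(42)

def relu(x):
    return np.maximum(0.0, x)

def softmax(x):
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical layer sizes: 71 spectral channels -> 64 -> 32 -> n_classes.
# The published network's exact architecture may differ.
n_channels, n_classes = 71, 4
W1, b1 = rng.normal(0, 0.1, (n_channels, 64)), np.zeros(64)
W2, b2 = rng.normal(0, 0.1, (64, 32)), np.zeros(32)
W3, b3 = rng.normal(0, 0.1, (32, n_classes)), np.zeros(n_classes)

def classify_pixels(spectra):
    """spectra: (n_pixels, 71) noisy per-pixel SFG spectra.
    Returns class probabilities per pixel; a chemical map is just
    the argmax reshaped back to the image dimensions."""
    h = relu(spectra @ W1 + b1)
    h = relu(h @ W2 + b2)
    return softmax(h @ W3 + b3)

# Example: classify a 16x16 image pixel by pixel (1x1 binning).
image_spectra = rng.normal(size=(16 * 16, n_channels))
probs = classify_pixels(image_spectra)
chemical_map = probs.argmax(axis=-1).reshape(16, 16)
```

Binning simply means averaging the spectra of an 8×8 pixel block before this forward pass, trading resolution for signal-to-noise.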
This project shows the power of data-driven models in domains traditionally governed by physics-based approaches. ChemSpecNet opens new possibilities for fast, high-resolution chemical imaging in materials science, nanotechnology, and biomedical sensing—setting a new standard for applying machine learning in hyperspectral and spectroscopic imaging.
Image generated from 1×1 binning using ChemSpecNet
Image generated from 8×8 binning using ChemSpecNet
Metallic nanoparticles exhibit plasmonic oscillations—collective electron resonances that produce remarkable optical properties exploited in metasurfaces, nanophotonics, light conversion, and telecommunications. Despite their technological importance, designing particles with targeted optical responses remains largely empirical. Full-wave electromagnetic simulations are computationally prohibitive, particularly for nonlinear optical phenomena where simulation costs increase by orders of magnitude.
This project develops an AI-driven surrogate modeling framework that replaces expensive numerical solvers with fast, differentiable neural networks—enabling both rapid prediction and gradient-based inverse design of nanoparticle geometries.
Approach
Shape Encoding via Convolutional Autoencoder
Arbitrary particle geometries (sub-400 nm, represented as bitmap images with fixed orientation) are compressed into a compact latent space using a convolutional autoencoder. The training set emphasizes shapes with broken inversion symmetry, a requirement for second-order nonlinear response dictated by symmetry selection rules in nonlinear optics.
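As a minimal stand-in for this step, a linear autoencoder (truncated SVD / PCA) shows the same idea of compressing shape bitmaps into a compact latent code; the actual model is convolutional, and the dataset size and latent dimension below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)

# Stand-in dataset: 200 binary 32x32 shape bitmaps, flattened.
shapes = (rng.random((200, 32 * 32)) > 0.5).astype(float)
mean = shapes.mean(axis=0)
U, S, Vt = np.linalg.svd(shapes - mean, full_matrices=False)

k = 16                                        # latent dimension (hypothetical)
encode = lambda x: (x - mean) @ Vt[:k].T      # bitmap -> latent code
decode = lambda z: z @ Vt[:k] + mean          # latent code -> bitmap

z = encode(shapes)                            # (200, 16) latent codes
recon = decode(z)
```

The convolutional encoder plays the same role as `encode` here, but learns a nonlinear, translation-aware compression suited to particle geometries.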
Spectra Prediction Network
The latent representation feeds a regression network that predicts the full nonlinear optical response: second-harmonic generation (SHG), sum-frequency generation (SFG), and third-harmonic generation (THG)—including both intensity and phase in the far field. The surrogate model accepts material identity (Au, Ag, Al), embedding medium (air or glass), and assumes broadband excitation at normal incidence with defined polarization.
Inverse Design via Latent Space Optimization
The differentiable surrogate enables gradient-based optimization directly in the learned latent space to identify geometries producing a desired spectral response. Fabrication constraints—minimum feature size and angular limits—are incorporated into the optimization to ensure physically realizable designs. The low-fidelity surrogate also serves as a fast explorer of the parameter space before committing to expensive validation simulations.
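The inverse-design loop can be illustrated with a toy differentiable surrogate. Here the surrogate is a fixed linear map from an 8-dimensional latent code to a 16-point "spectrum"; the real surrogate is the trained neural network, but any differentiable map admits the same gradient-descent loop, and the dimensions, learning rate, and step count are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for the trained surrogate: latent code -> spectrum.
A = rng.normal(size=(16, 8))
surrogate = lambda z: A @ z

def invert(target, steps=500, lr=0.01):
    """Gradient descent in latent space toward a desired spectrum."""
    z = np.zeros(8)
    for _ in range(steps):
        residual = surrogate(z) - target
        grad = 2.0 * A.T @ residual   # analytic gradient of ||A z - t||^2
        z -= lr * grad
        # Fabrication constraints would be enforced here, e.g. by
        # projecting z back onto a feasible region after each step.
    return z

target = rng.normal(size=16)
z_opt = invert(target)
```

Decoding `z_opt` back to a bitmap yields a candidate geometry, which is then checked against the full FDTD solver before fabrication.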
Technical Details
Particle size: < 400 nm
Input representation: Bitmap images, fixed lab-frame orientation
Materials: Au, Ag, Al
Environment: Air or glass substrate
Excitation: Normal incidence, polarized, broadband visible pulse
Target responses: SHG, SFG, THG (intensity & phase, far-field)
Ground truth solver: FDTD
Dataset: ~1000 shapes, biased toward broken inversion symmetry
Training: Sequential (autoencoder → predictor)
Symmetry constraints: Informed by nonlinear optics selection rules
Impact
This framework reduces spectral prediction from hours of FDTD computation to milliseconds of neural network inference, making high-throughput screening and on-the-fly optimization tractable. By embedding physical symmetry constraints and fabrication tolerances directly into the pipeline, the approach bridges the gap between computational design and experimental realization of functional nonlinear plasmonic nanostructures.
Network architecture for surrogate modeling of the nonlinear optical response of plasmonic nanoparticles.
Faithful reconstruction (right column) of input particle shapes (left column) by the autoencoder.
Nuclear spin-spin coupling constants (J-couplings) encode critical structural and conformational information in NMR spectroscopy, serving as sensitive probes of through-bond connectivity and molecular geometry. While density functional theory (DFT) provides accurate predictions of these quantum mechanical observables, the computational cost scales unfavorably with molecular size, limiting high-throughput screening and real-time spectral interpretation. This creates a compelling case for machine learning surrogates that can approximate DFT-level accuracy at a fraction of the computational cost.
This project leverages graph neural networks to learn the mapping from molecular structure to J-coupling constants. Molecules are naturally represented as graphs—atoms as nodes, bonds as edges—with node and edge features encoding chemical identity, hybridization, and local electronic environment. Message-passing layers propagate information across the molecular topology, learning representations that capture the multi-bond pathways through which nuclear spins interact. Trained on DFT-computed coupling constants, the model provides rapid inference of J-couplings for arbitrary organic molecules, enabling applications in automated structure elucidation, conformational analysis, and virtual screening where traditional quantum chemical calculations would be prohibitively expensive.
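The message-passing idea can be sketched with untrained numpy weights. The aggregation scheme, feature sizes, and edge readout below are simplified illustrations, not the project's actual architecture; in practice the weights are fit to DFT-computed couplings.

```python
import numpy as np

rng = np.random.default_rng(3)

def message_passing(node_feats, adj, W, rounds=3):
    """Mean-aggregate neighbor features and mix with a learned matrix.
    node_feats: (n_atoms, d); adj: (n_atoms, n_atoms) 0/1 bond matrix.
    Three rounds let information travel three bonds, matching the
    multi-bond pathways through which J-couplings act."""
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)
    h = node_feats
    for _ in range(rounds):
        h = np.tanh((adj @ h) / deg @ W)  # propagate along bonds
    return h

def predict_coupling(h, i, j, w_out):
    """Edge readout: a scalar J-coupling for the atom pair (i, j).
    The symmetric pair features guarantee J(i, j) == J(j, i)."""
    pair = np.concatenate([h[i] + h[j], np.abs(h[i] - h[j])])
    return float(pair @ w_out)

# Toy molecule: 5 atoms in a chain, 8-dim random features and weights.
n, d = 5, 8
adj = np.zeros((n, n))
for a in range(n - 1):
    adj[a, a + 1] = adj[a + 1, a] = 1.0
feats = rng.normal(size=(n, d))
W = rng.normal(0, 0.5, (d, d))
w_out = rng.normal(size=2 * d)

h = message_passing(feats, adj, W)
j01 = predict_coupling(h, 0, 1, w_out)
```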
Digitizing legacy chemical databases—scanned NMR spectra, handwritten lab records, and archived publications—requires optical character recognition that extends far beyond standard text extraction. Chemical documents contain a heterogeneous mix of entity types: compound names with complex IUPAC nomenclature, molecular formulas with subscripts and superscripts, numerical property values with units, and SMILES strings encoding full molecular topology in linear notation. Extracting these reliably demands domain-aware parsing that understands chemical semantics, not just visual pattern matching. Additionally, proprietary or sensitive datasets often preclude the use of cloud-based APIs, necessitating fully local inference pipelines.
This project implements an agentic OCR workflow using LangChain to orchestrate multi-step extraction and validation, powered by Qwen vision-language models running locally via Ollama. The agentic architecture enables iterative reasoning: the system can identify ambiguous regions, re-examine context, cross-check extracted SMILES for chemical validity, and reconcile inconsistencies between names and structures—capabilities beyond single-pass OCR. Local deployment satisfies data privacy and security requirements while maintaining competitive extraction accuracy. The pipeline outputs structured records linking compound identifiers, molecular representations, and associated spectroscopic or physicochemical properties, ready for downstream cheminformatics workflows.
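The extract-validate-retry loop at the heart of the agentic workflow can be sketched as below. The model call is a stub standing in for the Qwen vision-language model served via Ollama, and the SMILES check is a deliberately naive character-level filter, not real chemistry; the production pipeline uses a cheminformatics toolkit for true validity.

```python
import re

def vlm_extract(image_region):
    """Stub for the local vision-language model call (in the real
    pipeline, a Qwen VL model served via Ollama). Here it returns a
    canned record so the orchestration logic can be demonstrated."""
    return {"name": "ethanol", "smiles": "CCO", "property": "bp 78.4 C"}

def smiles_plausible(s):
    """Cheap sanity check only: allowed characters plus balanced
    parentheses and brackets."""
    if not re.fullmatch(r"[A-Za-z0-9@+\-\[\]()=#$/\\%.]+", s):
        return False
    return s.count("(") == s.count(")") and s.count("[") == s.count("]")

def extract_record(image_region, max_retries=2):
    """Agentic loop: extract, validate, and re-examine on failure."""
    for attempt in range(max_retries + 1):
        record = vlm_extract(image_region)
        if smiles_plausible(record["smiles"]):
            record["validated"] = True
            return record
        # On failure, an agent would re-crop the region, adjust the
        # prompt, or cross-check the name against the structure.
    record["validated"] = False
    return record

record = extract_record("scanned_page_region.png")
```

Each validated record is then emitted as a structured row linking name, SMILES, and properties for downstream cheminformatics use.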