Hassan Javed
AI Developer & Researcher

I'm an AI Developer and Researcher with 4+ years of experience in Deep Learning, LLM fine-tuning, and Multimodal AI systems. I specialize in delivering impactful solutions in NLP, Computer Vision, and MLOps, with hands-on expertise in deploying production-grade AI applications.

My experience spans from developing distributed multi-GPU LLM inference systems to building RAG applications and digital ink recognition models. I have a proven track record in both academic and industrial R&D, including work in fast-paced startup environments.

Currently, I'm working as an AI Developer (Research Officer) at CENTAIC - NASTP and as an AI Researcher at Ink AI. I'm passionate about transforming complex challenges into elegant, production-ready AI solutions.

About Me

Hello! I'm Hassan Javed, an AI Developer and Researcher with 4+ years of experience in Deep Learning, LLM fine-tuning, and Multimodal AI systems. I have a proven track record in both academic and industrial R&D, including work in fast-paced startup environments.

I specialize in delivering impactful solutions in NLP, Computer Vision, and MLOps, with hands-on expertise in deploying production-grade AI applications. My experience spans from developing distributed multi-GPU LLM inference systems to building sophisticated RAG applications and handwriting synthesis models.

Currently, I'm working as an AI Developer (Research Officer) at CENTAIC - NASTP, while also serving as an AI Researcher at Ink AI. I'm passionate about pushing the boundaries of AI technology and transforming complex challenges into elegant, production-ready solutions.

Here are a few key technologies I work with:

Python, JavaScript, C++, TensorFlow, PyTorch, LLaMA Models, GPT Models, Qwen Models, Computer Vision, Deep Learning, YOLOv8/v9, OpenCV, LangChain, LlamaIndex, FastAPI, Django, Docker, AWS, GCP, ChromaDB, Qdrant, Neo4j, GraphRAG, MLOps

Education

  • National University of Computer and Emerging Sciences (FAST) (Aug. 2022 – Feb. 2025)

    Master of Science in Data Science | CGPA: 3.17/4.00

    Research: Conducted research on Plant Species Identification using Vision Transformers (ViT) with metadata fusion, achieving 97.27% classification accuracy. Published in CMC: Computers, Materials & Continua Journal.
    Teaching Assistant - Web Programming (2022–2023): Supported BS students for 2 semesters with coding, debugging, and MERN Stack (MongoDB, Express.js, React.js, Node.js) development.

  • University of Kotli (AJK) (Oct. 2017 – Nov. 2021)

    Bachelor of Science in Computer Science | CGPA: 3.86/4.00

    Final Year Project - Intelligent Waste Management System: Developed a custom-trained Mask R-CNN model for detecting garbage in open areas and dustbins, achieving 94.34% accuracy for each waste class.
    Key Achievements: Created custom dataset of 500+ images across 3 waste classes, implemented computer vision algorithms for volume measurement, and deployed using Django and REST APIs.

Research Experience

  • FAST National University of Computer and Emerging Sciences (Jan 2024 - Feb 2025)

    Graduate Researcher | Supervisor: Dr. Labiba Fahad, Ph.D

    Plant Species Identification with AI: Conducted research on advanced plant species identification by integrating Vision Transformer (ViT) models with metadata fusion, achieving 97.27% classification accuracy and 0.9842 Mean Reciprocal Rank.
    Multi-Modal Deep Learning Architecture: Developed a hybrid approach combining deep visual feature extraction with geographical and taxonomic metadata to enhance identification of morphologically similar species, significantly outperforming traditional CNN-based methods.
    Data Preprocessing and Analysis: Implemented comprehensive preprocessing pipeline with data augmentation techniques to improve model robustness across diverse environmental conditions, utilizing PyTorch for model development and evaluation.

  • University of Kotli (AJK) (Mar 2021 - Oct 2021)

    Undergraduate Researcher | Supervisor: Dr. Zahid Mehmood, Ph.D

    Intelligent Waste Management System: Developed a custom-trained Mask R-CNN model for detecting garbage in open areas and dustbins, achieving 94.34% accuracy for each waste class, enhancing environmental monitoring capabilities.
    Dataset Creation and Labeling: Created and meticulously labeled a custom dataset of approximately 500 images across 3 waste classes, establishing the foundation for accurate model training and validation.
    Volume Measurement Techniques: Implemented advanced computer vision algorithms for garbage volume measurement, enabling precise segmentation of waste areas to optimize collection schedules and resource allocation.
    Web Application Development: Deployed the trained model using Django framework and REST APIs, ensuring a robust and scalable application for real-time waste monitoring and management.

Professional Experience

  • Ink AI (Dec 2024 - Present)

    AI Researcher | Delaware, USA

    Digital Ink Recognition and Synthesis R&D: Conducted research in digital ink recognition with focus on handwriting generation using reference and target text. Trained and evaluated custom models to mimic user handwriting.
    Model Development and Accuracy Improvement: Designed and trained RNN, LSTM-based, Transformer-based, and Diffusion-based models on the IAM-OnDB and Brush Online datasets. Achieved an accuracy improvement from 89% to 93% in handwriting generation.
    Cloud Deployment and Model Serving: Deployed trained models on Google Cloud Platform (GCP), configured training pipelines, and exposed endpoints for inference. Integrated deployed services into both web and Android applications.
    Algorithm Development: Designed and implemented custom algorithms tailored to specific project requirements, optimizing performance and accuracy without relying on off-the-shelf solutions.
    Semantic Similarity Search with Vector DBs: Built a notebook content search system using ChromaDB and OpenAI LLMs to enable context-aware retrieval through embedding-based similarity.
    System Integration and Backend Engineering: Integrated model APIs using FastAPI and RESTful interfaces. Supported deployment and service integration with ReactJS frontend and Kotlin-based components.
    MLOps and Infrastructure: Containerized services with Docker and utilized PyTorch and TensorFlow for training and serving models in production.
    Startup Contribution: Worked closely with the startup founded by Rich Miner (Co-founder of Android), contributing to R&D and operations.

  • CENTAIC – National Aerospace and Technology Park (NASTP) (Mar 2024 – Present)

    Artificial Intelligence Developer (Research Officer) | Rawalpindi, Pakistan

    Distributed Multi-GPU LLM Inference: Implemented distributed inference for the LLaMA 3.2 70B model across 4 NVIDIA Ada Gen 4500 GPUs using distributed-llama.cpp and vLLM. Built a streaming chat application that balanced workloads via Layer 3 switching, ensuring efficient GPU utilization and real-time token generation.
    Facial Recognition and Intruder Detection Pipeline: Developed and integrated end-to-end pipelines for facial recognition and face detection systems to support intruder detection. Optimized real-time inference and deployed solutions within existing security infrastructure.
    LLM Fine-Tuning and QA Systems: Fine-tuned LLaMA 3.2 and other foundation models (DeepSeek, Qwen2.5, GPT) using the QLoRA method on custom PDF datasets, building domain-specific question-answering systems.
    RAG System with Performance Boost: Developed a personalized RAG application using LangChain, Hugging Face, Qdrant, and Chrome integration. Improved answer accuracy from 70% to 80%+ through enhanced context retrieval techniques.
    Offline LLM Deployment and Inference: Deployed 70B GGUF LLMs locally using llama.cpp on A100 GPUs without internet. Integrated Open WebUI to enable RAG-based chatbot functionality in disconnected environments.
    AI Pipelines with Graph & Vector DBs: Engineered hybrid pipelines using LangChain and LangGraph, combining vector and graph databases for efficient, structured knowledge retrieval.
    Time Series Forecasting for Defense Systems: Created a synthetic dataset of 1 million records to train LSTM models for forecasting and anomaly detection. Boosted accuracy from 75% to 89%.
    Explainable AI in Anomaly Detection: Built Random Forest models with SHAP-based explainability to highlight key threat indicators for air defense systems.
    Computer Vision with YOLO: Trained YOLOv8 and YOLOv9 models on custom datasets and deployed them via Flask-based web apps for real-time object detection.
    Geospatial AI for Flight Pattern Simulation: Used QGIS and ML to analyze location-based data and simulate realistic flight paths using a hybrid of rule-based and machine learning systems.

  • PanaceaLogics (Nov 2023 – Mar 2024)

    Artificial Intelligence Research Executive | Rawalpindi, Pakistan

    NLP Solutions: Implemented NLP solutions utilizing LLMs, LayoutLMv2, LayoutLMv3, LiLTv2, OCR, and GPT-2 technologies.
    Document Extraction Systems: Developed chatbots and document extraction systems using BERT-based models and LangChain.
    Proof of Concept (POC) Development: Created POC applications with Flask demonstrating NLP capabilities using OpenAI APIs.
    Model Optimization: Applied RAG techniques to LLaMA2 and LLaMA3 models and optimized the models using LoRA and QLoRA methods.
    LLM Fine-tuning: Fine-tuned LLMs for customized application needs and enhanced performance.
    Team Mentoring: Mentored and guided junior engineers and interns, fostering professional development in AI and NLP.

  • PanaceaLogics (Jul 2023 – Oct 2023)

    Artificial Intelligence Specialist | Rawalpindi, Pakistan

    Data Preparation: Collected, labeled, and preprocessed fish datasets for classification, segmentation, and object detection.
    Fish BMI and Monitoring Systems: Calculated fish BMI using image analysis and implemented paddle wheel detection for monitoring through stereo-vision techniques.
    PDF-based NLP Retrieval System: Developed a query-based search system across uploaded PDFs to improve data analysis and retrieval efficiency.

  • 2hm Inc. (Jan 2022 – Nov 2023)

    Data Scientist | El Dorado Hills, California, USA

    Disease Research & Classification: Conducted comprehensive research on chicken diseases, implemented advanced data augmentation techniques, and developed classification models for disease identification.
    Performance Evaluation System: Created a web-based performance evaluation application using Flask framework to assess model accuracy and system efficiency.
    Marketing Campaign Analytics: Performed extensive data mining on marketing campaign datasets, including data wrangling, exploratory data analysis, model building, and hyperparameter fine-tuning.
    Business Intelligence Solutions: Developed interactive PowerBI dashboards and created automated triggers for multiple database queries to enhance data visualization and analysis capabilities.
    Technical Documentation: Authored comprehensive technical reports documenting methodology, findings, and recommendations for stakeholders.

Publications

  • Enhanced Plant Species Identification using Metadata Fusion and Vision Transformers (2025)

    Authors: Hassan Javed, Dr. Labiba Fahad, et al. | Published

    Research Paper: Proposed a novel approach integrating Vision Transformer (ViT) models with metadata fusion to enhance plant species classification accuracy to 97.27%, addressing challenges in distinguishing morphologically similar species.
    Key Contributions: Developed an AI-driven methodology combining deep visual feature extraction with geographical and taxonomic metadata, demonstrating significant improvements over traditional CNN-based approaches. Full paper available at CMC: Computers, Materials & Continua Journal (TechScience).

Certifications

AI & Deep Learning
  • Neural Networks and Deep Learning | DeepLearning.AI | Jul 2024
  • Build Basic GANs | DeepLearning.AI | Jan 2024
  • TensorFlow for AI, ML & DL | DeepLearning.AI | Sep 2023
Machine Learning
  • Machine Learning | Stanford University | Mar 2021
  • Intermediate Machine Learning | Kaggle | Mar 2021
  • Intro to Machine Learning | Kaggle | Mar 2021
Data Science
  • What is Data Science? | Coursera | Jul 2023
  • Data Manipulation with Pandas | DataCamp | Dec 2022
  • Data Visualization | Kaggle | Apr 2022
Google & Analytics
  • Ask Questions to Make Data-Driven Decisions | Google | Jan 2023
  • Foundations: Data, Data, Everywhere | Google | Dec 2022
Programming
  • MTA: Programming Using Python | Microsoft | Jan 2021
  • Intermediate Python | DataCamp | Dec 2022
  • Intro to Programming Using Python | DataCamp | Dec 2022
Tools & Ethics
  • MATLAB Certified | MathWorks | Aug 2023
  • Ethics in the Age of Generative AI | LinkedIn | Jul 2023

Achievements & Awards

  • Industrial Exhibition (MUST) (2021)

    Control Automotive and Robotics Lab Participation Shield

    Recognition: Honored as the first student team from University of Kotli (AJK) to present our Intelligent Waste Management System at this prestigious event, receiving a participation shield for demonstrating innovative application of computer vision techniques.

Extracurricular Activities

  • CENTAIC – National Aerospace and Technology Park (NASTP) (Jun 2024 - Aug 2024)

    AI Education Lead, Mentorship Program | Rawalpindi, Pakistan

    Technical Instruction: Developed and delivered comprehensive 3-month curriculum on Machine Learning and Deep Learning fundamentals to a cohort of interns, covering both theoretical foundations and practical implementations.
    Mathematical Foundations: Guided students through essential mathematical concepts underpinning ML/DL algorithms, including linear algebra, calculus, probability theory, and statistical methods.
    Hands-on Programming: Facilitated intensive PyTorch workshops, enabling participants to implement neural networks from scratch and deploy working AI solutions.
    Knowledge Transfer: Bridged theoretical knowledge with practical applications, empowering the next generation of AI practitioners within Pakistan's developing aerospace technology sector.

Latest Blog Posts

  • Llama.cpp is a lightweight and efficient implementation of LLaMA (Large Language Model Meta AI) that allows you to run models locally on your machine. In this guide, we will walk through the process of installing Llama.cpp on Ubuntu 22.04 LTS, downloading model weights in GGUF format, and running inference.

  • Large Language Models (LLMs) have revolutionized AI applications, enabling capabilities like text generation, reasoning, and automation. While cloud-based APIs such as OpenAI and Anthropic provide easy access to powerful models, running LLMs locally has become an attractive alternative for developers and enterprises looking for lower latency, privacy, and cost efficiency. Deploying LLMs locally, however, comes with its own set of challenges: model optimization, hardware constraints, and inference speed are some of the key considerations. Fortunately, several frameworks and tools have emerged to simplify local LLM serving, each offering unique advantages in terms of efficiency, scalability, and ease of use.

  • For many AI applications, such as chatbots, summarization, extracting contextual information from long documents and data, and code creation, large language models (LLMs) like GPT, DeepSeek, LLaMA, Falcon, Qwen, and Mistral have become essential. However, there are many obstacles to effectively implementing and operating these models locally:

    • Large quantities of VRAM and GPU resources are needed to run LLMs.
    • Hugging Face Transformers and other standard inference frameworks frequently result in low throughput and significant latency.
    • Out-of-memory (OOM) issues occur because LLMs require a lot of memory to store activations.
    • It is challenging to manage several user requests at once without dynamic batching.
    • Traditional approaches do not support multi-GPU and multi-node scalability well.

    Efficient LLM inference and deployment optimization is a collection of methods for improving the scalability, speed, and resource usage of LLM deployments. This involves minimizing memory usage when serving large models, maximizing inference speed to lower latency, and using GPUs effectively to support many users at once. The aim is to make LLMs practical in real-world applications, with lower costs, faster response times, and better scalability, whether on local machines or cloud-based infrastructure. To overcome these difficulties, solutions such as vLLM and related frameworks introduce multi-GPU execution methods, batching strategies, and sophisticated memory management.
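
To make the memory pressure mentioned above more concrete, here is a minimal back-of-the-envelope sketch of how the attention key/value cache grows with sequence length and the number of concurrent requests. It is not taken from the post itself: the function name `kv_cache_bytes` and the model dimensions (32 layers, 32 key/value heads, head size 128, 16-bit values) are illustrative assumptions, roughly in the range of a 7B-parameter decoder with standard multi-head attention.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim,
                   seq_len, batch_size=1, bytes_per_value=2):
    """Rough size in bytes of the attention key/value cache.

    Each layer stores one key vector and one value vector per token,
    hence the leading factor of 2.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)


# Assumed, illustrative model dimensions (not from any specific checkpoint).
LAYERS, KV_HEADS, HEAD_DIM = 32, 32, 128

per_token = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, seq_len=1)
total = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, seq_len=4096, batch_size=8)

print(f"{per_token / 2**20:.2f} MiB per token")                    # 0.50 MiB per token
print(f"{total / 2**30:.1f} GiB for 8 sequences of 4096 tokens")  # 16.0 GiB
```

Under these assumptions the cache alone needs about 16 GiB for eight concurrent 4096-token requests, before counting the model weights, which is why batching and cache management matter so much for multi-user serving.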

Let's Connect
