Automated Large Language Model Evaluation & APLCOLLAB

While at JHU APL, I developed a script that lets GPT-3.5 Turbo converse with either Dolly 12B or StableVicuna 13B in a directed dialog. It was designed to eliminate the manual testing of large language models that 15 staff members had been doing, saving hundreds of hours of effort.
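To make the idea of a directed dialog concrete, here is a minimal sketch of the loop, not the original script. It assumes each model can be wrapped as a plain function from prompt to reply; `run_directed_dialog`, `stub_evaluator`, and `stub_subject` are hypothetical names, and the stubs stand in for real calls to GPT-3.5 Turbo and to Dolly or StableVicuna.

```python
from typing import Callable, List, Tuple

# Assumption: a "model" is any function mapping a prompt string to a reply string.
Model = Callable[[str], str]


def run_directed_dialog(
    evaluator: Model,
    subject: Model,
    opening_prompt: str,
    max_turns: int = 3,
) -> List[Tuple[str, str]]:
    """Alternate between the two models and record each (prompt, answer) pair."""
    transcript: List[Tuple[str, str]] = []
    prompt = opening_prompt
    for _ in range(max_turns):
        answer = subject(prompt)          # model under test replies
        transcript.append((prompt, answer))
        prompt = evaluator(answer)        # evaluator steers the next turn
    return transcript


# Hypothetical stand-ins; a real setup would call the actual models here.
def stub_evaluator(answer: str) -> str:
    return f"Follow-up: please justify '{answer}'"


def stub_subject(prompt: str) -> str:
    return f"reply to [{prompt}]"


if __name__ == "__main__":
    for question, answer in run_directed_dialog(
        stub_evaluator, stub_subject, "What is 2 + 2?", max_turns=2
    ):
        print("Q:", question)
        print("A:", answer)
```

Keeping the models behind a simple callable interface means the same loop could drive any pair of models, and the returned transcript can be stored or scored afterwards.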

I also built APLCOLLAB, a standardized Docker-based machine learning environment that includes PyTorch and TensorFlow and is available to all APL employees. I set up a thorough, dynamic GitLab CI/CD pipeline that ensures the environment stays stable across a variety of GPUs.
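One possible reading of a "dynamic" pipeline is a GitLab child pipeline whose YAML is generated at run time, with one job per GPU type. The sketch below illustrates that idea only and is not the actual APLCOLLAB pipeline: the function name `render_child_pipeline`, the runner tags, and the image name are all hypothetical.

```python
from typing import Iterable


def render_child_pipeline(gpu_tags: Iterable[str], image: str) -> str:
    """Return GitLab CI YAML with one GPU smoke-test job per runner tag."""
    lines = []
    for tag in gpu_tags:
        lines += [
            f"smoke-test-{tag}:",
            f"  image: {image}",
            "  tags:",
            f"    - {tag}",  # routes the job to a runner with this tag
            "  script:",
            # Fail the job if either framework cannot see a GPU.
            '    - python -c "import torch; assert torch.cuda.is_available()"',
            "    - python -c \"import tensorflow as tf; "
            "assert tf.config.list_physical_devices('GPU')\"",
            "",
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical runner tags and image name, for illustration only.
    print(render_child_pipeline(["gpu-a100", "gpu-v100"], "aplcollab:latest"))
```

In a setup like this, a parent job would save the generated YAML as an artifact and a trigger job would include it as a child pipeline, so supporting a new GPU type only means adding a tag to the list.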

This post is licensed under CC BY 4.0 by the author.