Welcome to Research Log #022! We document weekly research progress across the various initiatives in the Manifold Research Group, and highlight breakthroughs from the broader research community we think are interesting in the Pulse of AI!
We’re growing our core team and pursuing new projects. If you’re interested in working together, join the conversation on Discord and check out our GitHub.
NEKO
The NEKO Project aims to build the first large-scale, Open Source "Generalist" Model, trained on numerous modalities including control and robotics tasks. You can learn more about it here.
- Language+Vision: We have successfully merged the Captions branch into main! This means we can focus on the final stages of Visual Question Answering. You can read more here.
- Language: Bhavul has switched our main dataset from WikiText to OpenWebText and is planning new training runs. This change could bring NEKO to the same level of performance as GPT-2! If you want to contribute to the test runs, check out the repo.
- Datasets: The team has been discussing a final list for the datasets we could use for the Vision/Language tasks. We have narrowed it down to some potential candidates, which can be found in the benchmark proposal tab of this sheet. If you have additional datasets you believe may be helpful to the project, please let us know!
Agent Forge
The AgentForge Project aims to build models, tools, and frameworks that allow anyone to build much more powerful AI agents capable of using tools and interacting with the digital and physical worlds.
- Agent Survey: The AgentForge Team is continuing with our Agent Survey, and we are updating our contributor process so the community can more easily get involved. The survey can be found here. If you would like to contribute to this survey, please feel free to join the conversation on our Discord!
- Function Calling Datasets: We are also exploring the currently existing function calling datasets, and looking into increasing coverage. Function calling is the first step in building models capable of longer sequences of actions using APIs. We'll have more on the longer term vision of this project to share soon.
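To make the function-calling idea concrete, here is a minimal sketch of what a single function-calling training example might look like, along with the kind of validity check used to filter noisy data. The field names (`functions`, `query`, `target_call`) and the `get_weather` function are illustrative assumptions, not the format of any specific dataset.

```python
# Hypothetical function-calling dataset entry: a declared function
# schema, a natural-language query, and the target structured call.
example = {
    "functions": [{
        "name": "get_weather",  # assumed example function, not from a real API
        "description": "Fetch current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }],
    "query": "What's the weather in Paris right now?",
    "target_call": {"name": "get_weather", "arguments": {"city": "Paris"}},
}

def call_is_valid(example):
    """Check that the target call names a declared function and
    supplies every required argument."""
    specs = {f["name"]: f for f in example["functions"]}
    call = example["target_call"]
    if call["name"] not in specs:
        return False
    required = specs[call["name"]]["parameters"].get("required", [])
    return all(arg in call["arguments"] for arg in required)

print(call_is_valid(example))  # True
```

Checks like this matter for coverage: a dataset is only as useful as the fraction of its target calls that actually match their declared schemas.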
Pulse of AI
There have been some exciting advancements this week!
- Octo - An Open-Source Generalist Robot Policy: Since the release of the Open-X dataset, researchers have been eager to train models on it. Now this is a reality with Octo. Researchers at Berkeley, Stanford, and CMU have teamed up to produce two new models for robotics: Octo-Small and Octo-Base. More information is available on their official release webpage.
- Phi-2: A tiny model that packs a punch, released by Microsoft. The new model has only 2.7 billion parameters, yet it can match models up to 25 times its size. Microsoft kept the recipe of training on textbooks and other high-quality sources, released the weights to the public, and published a release blog post.
- FunSearch: People thought that Large Language Models (LLMs) were incapable of generating new knowledge. Now researchers at Google DeepMind have demonstrated that LLMs can be driven to explore new ideas: they prompt a model for candidate programs, save the answers in a database, evaluate the generated code, and use the best-scoring answers as the basis for new prompts. If you want to dig deeper, read their post.
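The prompt-evaluate-select loop behind FunSearch can be sketched in a few lines. This is a toy sketch, not DeepMind's implementation: the real system asks an LLM to propose programs and runs a problem-specific evaluator, while here `propose` is a stand-in random mutation and `evaluate` a trivial score, so the loop is runnable end to end.

```python
import random

def evaluate(program):
    """Score a candidate 'program' (here just a list of ints).
    The real evaluator would execute LLM-generated code on a task."""
    return sum(program)

def propose(parent):
    """Stand-in for the LLM step: perturb the best known candidate."""
    child = parent[:]
    i = random.randrange(len(child))
    child[i] += random.choice([-1, 1])
    return child

def funsearch_loop(seed, iterations=200, keep=10):
    # Database of (score, program) pairs, as described above.
    database = [(evaluate(seed), seed)]
    for _ in range(iterations):
        _, best = max(database)              # best answer seeds the next prompt
        candidate = propose(best)
        database.append((evaluate(candidate), candidate))
        # Retain only the top-scoring candidates.
        database = sorted(database, reverse=True)[:keep]
    return max(database)

random.seed(0)
score, best = funsearch_loop([0, 0, 0])
print(score, best)
```

The key design point is that the database makes the search cumulative: each round of prompting starts from the strongest program found so far rather than from scratch.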
Stay tuned for next week where we discuss NeurIPS!
If you want to see more of our updates as we work to explore and advance the field of Intelligent Systems, follow us on Twitter, LinkedIn, and Mastodon!