Publications

Refusal in LLMs is an Affine Function

Thomas Marshall, Adam Scherlis, Nora Belrose (2024): We propose affine concept editing (ACE) as an approach for steering language models' behavior by intervening directly in activations. We begin with an affine decomposition of model activation vectors and show that prior methods for steering model behavior correspond to subsets of terms of this decomposition. We then provide a derivation of ACE and test it on refusal using Llama 3 8B and Hermes Eagle RWKV v5. ACE ultimately combines affine subspace projection...

Intelligent Digital Agents in the Era of Large Language Models

B Faught, H Lu, T Marshall, H Sikka, P Guruprasad, B Gauri (2024): In recent years, the emergence of large language models (LLMs) has revolutionized the field of artificial intelligence, showcasing remarkable proficiency in natural language understanding and generation. This advancement has spurred a growing research area focused on the development of LLM-based autonomous agents, aiming to achieve human-like decision-making capabilities...

Jill Watson: A Virtual Teaching Assistant powered by ChatGPT

Karan Taneja, Pratyusha Maiti, Sandeep Kakar, Pranav Guruprasad, Sanjeev Rao, Ashok K. Goel (2024): Conversational AI agents often require extensive datasets for training that are not publicly released, are limited to social chit-chat or handling a specific domain, and may not be easily extended to accommodate the latest advances in AI technologies. This paper introduces Jill Watson, a conversational Virtual Teaching Assistant (VTA) leveraging the capabilities of ChatGPT. Jill Watson based on ChatGPT requires no prior training...

Does Transformer Interpretability Transfer to RNNs?

Gonçalo Paulo, Thomas Marshall, Nora Belrose (2024): Recent advances in recurrent neural network architectures, such as Mamba and RWKV, have enabled RNNs to match or exceed the performance of equal-size transformers in terms of language modeling perplexity and downstream evaluations, suggesting that future systems may be built on completely new architectures. In this paper, we examine if selected interpretability methods originally designed for transformer language models will...

PIILO: an open-source system for personally identifiable information labeling and obfuscation

L Holmes, S Crossley, H Sikka, W Morris (2023): This study aims to report on an automatic deidentification system for labeling and obfuscating personally identifiable information (PII) in student-generated text.

Designing a Communication Bridge between Communities: Participatory Design for a Question-Answering AI Agent

J Lee, V Nandan, H Sikka, S Rugaber, A Goel (2023): How do we design an AI system that is intended to act as a communication bridge between two user communities with different mental models and vocabularies? Skillsync is an interactive environment that engages employers (companies) and training providers (colleges) in a sustained dialogue to help them achieve the goal of building a training proposal that successfully meets the needs of the employers and employees...

Deidentifying Student Writing with Rules and Transformers

L Holmes, SA Crossley, W Morris, H Sikka, A Trumbore (2023): As education increasingly takes place in technologically mediated settings, it has become easier to collect student data that would be valuable to researchers. However, much of this data is not available due to concerns surrounding the protection of student privacy. Deidentification of student data is a partial solution to this problem, but student-generated text, a form of unstructured data, is a major challenge for deidentification strategies...

Human-AI Interaction Design in Machine Teaching

K Taneja, H Sikka, A Goel (2022): Machine Teaching (MT) is an interactive process where a human and a machine interact with the goal of training a machine learning model (ML) for a specified task. The human teacher communicates their task expertise and the machine student gathers the required data and knowledge to produce an ML model. MT systems are developed to jointly minimize the time spent on teaching and the learner's error rate. The design of human-AI interaction in an MT system not only impacts the teaching efficiency, but also indirectly influences the ML performance...

Reface: Real-time adversarial attacks on face recognition systems

S Hussain, T Huster, C Mesterharm, P Neekhara, K An, M Jere, H Sikka (2022): Deep neural network based face recognition models have been shown to be vulnerable to adversarial examples. However, many of the past attacks require the adversary to solve an input-dependent optimization problem using gradient descent which makes the attack impractical in real-time. These adversarial examples are also tightly coupled to the attacked model and are not as successful in transferring to different models...

Explanation as Question Answering based on a Task Model of the Agent's Design

A Goel, H Sikka, V Nandan, J Lee, M Lisle, S Rugaber (2022): We describe a stance towards the generation of explanations in AI agents that is both human-centered and design-based. We collect questions about the working of an AI agent through participatory design by focus groups. We capture an agent's design through a Task-Method-Knowledge model that explicitly specifies the agent's tasks and goals, as well as the mechanisms, knowledge and vocabulary it uses for accomplishing the tasks. We illustrate our approach through the generation of explanations in Skillsync...

A framework for interactive knowledge-aided machine teaching

K Taneja, H Sikka, A Goel (2022): Machine Teaching (MT) is an interactive process where humans train a machine learning model by playing the role of a teacher. The process of designing an MT system involves decisions that can impact both efficiency of human teachers and performance of machine learners. Previous research has proposed and evaluated specific MT systems but there is limited discussion on a general framework for designing them. We propose a framework for designing MT systems and also detail...

Agent Smith: Machine Teaching for Building Question Answering Agents

AK Goel, H Sikka, E Gregori (2022): Building AI agents can be costly. Consider a question answering agent such as Jill Watson that automatically answers students' questions on the discussion forums of online classes based on their syllabi and other course materials. Training a Jill on the syllabus of a new online class can take a hundred hours or more. Machine teaching - interactive teaching of an AI agent using synthetic data sets - can reduce the training time because it combines the advantages of knowledge-based AI, machine learning using...

Benchmarking Differentially Private Residual Networks for Medical Imagery

S Singh, H Sikka, S Kotti, A Trask (2020): In this paper we measure the effectiveness of ε-Differential Privacy (DP) when applied to medical imaging. We compare two robust differential privacy mechanisms: Local-DP and DP-SGD and benchmark their performance when analyzing medical imagery records...

WeightScale: Interpreting Weight Change in Neural Networks

AM Agrawal, A Tendle, H Sikka, S Singh (2021): Interpreting the learning dynamics of neural networks can provide useful insights into how networks learn and the development of better training and design approaches. We present an approach to interpret learning in neural networks by measuring relative weight change on a per layer basis and dynamically aggregating emerging trends through combination of dimensionality reduction and clustering which allows us to scale to very deep networks...

Investigating Learning in Deep Neural Networks using Layer-Wise Weight Change

AM Agrawal, A Tendle, H Sikka, S Singh, A Kayid (2021): Understanding the per-layer learning dynamics of deep neural networks is of significant interest as it may provide insights into how neural networks learn and the potential for better training regimens. We investigate learning in Deep Convolutional Neural Networks (CNNs) by measuring the relative weight change of layers while training...

A Genetic Algorithm Based Approach for Satellite Autonomy

S Sikka, H Sikka (2021): Autonomous spacecraft maneuver planning using an evolutionary algorithmic approach is investigated. Simulated spacecraft were placed into four different initial orbits. Each was allowed a string of thirty delta-v impulse maneuvers in six cartesian directions, the positive and negative x, y and z directions. The goal of the spacecraft maneuver string was to, starting from some non-polar starting orbit, place the spacecraft into a polar, low eccentricity orbit...

Multimodal Modular Meta-Learning

H Sikka, A Tendle, A Kayid (2020): Many real world prediction problems involve structured tasks across multiple modalities. We propose to extend previous work in modular meta learning to the multimodal setting. Specifically, we present an algorithmic approach to apply task aware modulation to a modular meta learning system that decomposes structured multimodal problems into a set of modules that can be reassembled to learn new tasks...

Creating, Managing, and Understanding Large, Sparse, Multitask Neural Networks

H Sikka (2020): One of the popular directions in Deep Learning (DL) research has been to build larger and more complex deep networks that can perform well on several different learning tasks, commonly known as multitask learning. This work is usually done within specific domains, e.g. multitask models that perform captioning, translation, and text classification tasks...