
Security Research

Adversarial Keystroke Authentication

Designed an Android application and SQLite framework for collecting the keystroke data used in secondary authentication. Applied keystroke dynamics analysis to the captured timing data to approximate likely user passwords, demonstrating a practical adversarial attack on behavioral biometric authentication.

Year 2023
Role Researcher
Status Research / Demo
Android, SQLite, Keystroke Dynamics, Statistical Analysis, Behavioral Biometrics, Security Research

Overview

Keystroke dynamics, the use of typing rhythm as a behavioral biometric, has been proposed as a secondary authentication factor. The idea is that even if an attacker knows your password, they can't reproduce your unique typing rhythm.

This research project tests that assumption adversarially.

The Attack Model

The threat model assumes:

  • The attacker can observe raw keystroke timing data (e.g., via a malicious keyboard app, a compromised authentication endpoint, or physical access to timing logs)
  • The attacker does not know the password in advance
  • The goal is to recover a usable approximation of the password that also passes the behavioral authentication check

Data Collection

I built an Android application that acts as a keyboard instrumentation layer, recording:

  • Key-down and key-up timestamps for each keystroke (to millisecond precision)
  • Inter-key intervals (the gap between key-up of one key and key-down of the next)
  • Dwell times (how long each key is held)
  • The key identity itself (stored separately from timing, simulating different attacker access levels)
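The two derived features in the list above follow directly from the raw timestamps. A minimal Python sketch of that derivation, assuming a hypothetical `KeyEvent` record (the name and fields are illustrative, not the app's actual types):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyEvent:
    """Hypothetical record for one keystroke (illustrative only)."""
    key: str      # key identity; an attacker may not have access to this
    down_ms: int  # key-down timestamp in milliseconds
    up_ms: int    # key-up timestamp in milliseconds


def dwell_times(events):
    """Dwell time: how long each key is held (key-up minus key-down)."""
    return [e.up_ms - e.down_ms for e in events]


def inter_key_intervals(events):
    """Gap between key-up of one key and key-down of the next.

    The value is negative if the next key goes down before the
    previous one is released.
    """
    return [nxt.down_ms - prev.up_ms for prev, nxt in zip(events, events[1:])]


sample = [KeyEvent("p", 0, 90), KeyEvent("a", 150, 230), KeyEvent("s", 310, 400)]
print(dwell_times(sample))
print(inter_key_intervals(sample))
```

A sequence of n keystrokes yields n dwell times and n - 1 inter-key intervals.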

Records are written to a local SQLite database with a schema designed for efficient time-series queries across sessions.
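One way such a schema could look is sketched below with Python's built-in `sqlite3` module. The table and column names are assumptions made for illustration, not the project's actual schema. Key identities sit in their own table so that a timing-only attacker can be simulated by querying `keystrokes` alone.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative.
SCHEMA = """
CREATE TABLE sessions (
    session_id INTEGER PRIMARY KEY,
    user_id    TEXT    NOT NULL,
    started_ms INTEGER NOT NULL
);
CREATE TABLE keystrokes (
    keystroke_id INTEGER PRIMARY KEY,
    session_id   INTEGER NOT NULL REFERENCES sessions(session_id),
    seq          INTEGER NOT NULL,  -- position within the session
    down_ms      INTEGER NOT NULL,
    up_ms        INTEGER NOT NULL
);
CREATE TABLE key_identities (
    keystroke_id INTEGER PRIMARY KEY REFERENCES keystrokes(keystroke_id),
    key          TEXT NOT NULL
);
CREATE INDEX idx_keystrokes_session_seq ON keystrokes(session_id, seq);
"""


def open_db(path=":memory:"):
    """Open a database and create the tables and index."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_session(conn, user_id, started_ms, events):
    """Store one session; events is a list of (key, down_ms, up_ms) tuples."""
    cur = conn.execute(
        "INSERT INTO sessions (user_id, started_ms) VALUES (?, ?)",
        (user_id, started_ms),
    )
    session_id = cur.lastrowid
    for seq, (key, down_ms, up_ms) in enumerate(events):
        cur = conn.execute(
            "INSERT INTO keystrokes (session_id, seq, down_ms, up_ms) "
            "VALUES (?, ?, ?, ?)",
            (session_id, seq, down_ms, up_ms),
        )
        conn.execute(
            "INSERT INTO key_identities (keystroke_id, key) VALUES (?, ?)",
            (cur.lastrowid, key),
        )
    conn.commit()
    return session_id


def timing_only(conn, session_id):
    """Timing-only view of a session, with no key identities."""
    rows = conn.execute(
        "SELECT down_ms, up_ms FROM keystrokes WHERE session_id = ? ORDER BY seq",
        (session_id,),
    )
    return rows.fetchall()
```

The composite index on `(session_id, seq)` is meant to keep ordered per-session reads cheap, which is the access pattern the analysis stage relies on.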

Analysis

The statistical pipeline (Python, post-collection) builds a probabilistic model of each user's typing profile:

  1. Group inter-key intervals by bigram (consecutive key pair), since different bigrams have distinct timing distributions
  2. Model each distribution as a Gaussian; estimate μ and σ from the training corpus
  3. Score candidate inputs by their likelihood under the model, so that inputs matching the timing profile score higher
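The three steps above can be sketched in a few lines of standard-library Python. This is a simplified illustration rather than the project's pipeline: the function names, the `MIN_SIGMA_MS` floor, and the choice to skip bigrams with fewer than two samples are all assumptions.

```python
import math
from collections import defaultdict
from statistics import mean, stdev

MIN_SIGMA_MS = 1.0  # assumed floor so that identical samples do not give sigma = 0


def fit_bigram_model(samples):
    """Steps 1 and 2: group intervals by bigram, then estimate (mu, sigma).

    samples is an iterable of (bigram, interval_ms) pairs.
    """
    grouped = defaultdict(list)
    for bigram, interval_ms in samples:
        grouped[bigram].append(interval_ms)
    model = {}
    for bigram, values in grouped.items():
        if len(values) < 2:
            continue  # the sample standard deviation needs two or more points
        model[bigram] = (mean(values), max(stdev(values), MIN_SIGMA_MS))
    return model


def gaussian_logpdf(x, mu, sigma):
    """Log density of a normal distribution N(mu, sigma^2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def log_likelihood(model, observed):
    """Step 3: summed log-likelihood of observed (bigram, interval_ms) pairs.

    Bigrams absent from the model are skipped (an assumption of this sketch).
    """
    return sum(gaussian_logpdf(x, *model[b]) for b, x in observed if b in model)


training = [("pa", 60), ("pa", 64), ("pa", 56), ("as", 80), ("as", 90), ("as", 85)]
model = fit_bigram_model(training)
close = log_likelihood(model, [("pa", 61), ("as", 84)])
far = log_likelihood(model, [("pa", 120), ("as", 20)])
print(close > far)
```

Summing log densities treats the intervals as independent given the bigram, which is a simplification. Working in log space avoids numerical underflow on longer inputs.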

To approximate a password without knowing its content, the system uses the timing profile to constrain a search over likely candidates (common passwords, dictionary words), scoring each candidate's expected typing signature against the captured profile.
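A rough sketch of that candidate search follows. It assumes the attacker holds a per-bigram timing model for the target user and a timing-only trace of one password entry. The model values, the `FALLBACK` distribution for unseen bigrams, and the function names are hypothetical.

```python
import math

# Assumed broad (mu, sigma) in milliseconds for bigrams missing from the model.
FALLBACK = (150.0, 100.0)


def gaussian_logpdf(x, mu, sigma):
    """Log density of a normal distribution N(mu, sigma^2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def bigrams(word):
    """Consecutive character pairs: 'pass' -> ['pa', 'as', 'ss']."""
    return [word[i:i + 2] for i in range(len(word) - 1)]


def score_candidate(candidate, observed_intervals, model, fallback=FALLBACK):
    """Log-likelihood of the observed intervals if `candidate` had been typed."""
    pairs = bigrams(candidate)
    if len(pairs) != len(observed_intervals):
        return float("-inf")  # the keystroke count alone rules this candidate out
    return sum(
        gaussian_logpdf(x, *model.get(b, fallback))
        for b, x in zip(pairs, observed_intervals)
    )


def rank_candidates(candidates, observed_intervals, model):
    """Return the length-compatible candidates, best score first."""
    scored = [(score_candidate(c, observed_intervals, model), c) for c in candidates]
    return [c for s, c in sorted(scored, reverse=True) if s != float("-inf")]


# Hypothetical per-bigram (mu, sigma) values in milliseconds.
model = {
    "pa": (60.0, 5.0), "as": (85.0, 5.0), "ss": (200.0, 10.0),
    "le": (300.0, 10.0), "et": (40.0, 5.0), "tm": (90.0, 5.0),
}
observed = [62, 83, 198]  # timing-only trace of a four-key entry
ranking = rank_candidates(["letm", "pass", "password"], observed, model)
print(ranking)
```

The length check reflects how much a timing trace leaks on its own: the number of intervals fixes the candidate length before any statistical scoring happens.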

Findings

The attack is most effective against users with highly consistent typing patterns (ironically, the users for whom keystroke dynamics authentication is most reliable as a biometric). Users with high variance in their typing are harder to spoof but also harder to authenticate in the first place, which exposes a fundamental tension in behavioral biometric design.

Additionally, we found that layout-randomized keypads produced more mistakes, longer dwell times, and changes in keystroke speed, so each layout yielded a markedly different keystroke signature. This makes it harder to infer a password through profiling and statistical analysis than it was for the other samples, which suggests that randomized layouts better protect user privacy.

This work underscores that behavioral biometrics should be treated as a risk signal, not a secret: they are useful for anomaly detection but should not be the sole authentication gate.