Machine Detectives

Developing intelligent systems that decide which information to acquire, balancing the cost of gathering data against the benefit of improved predictions.

Active Feature Acquisition workflow diagram

Project overview

A framework for active feature acquisition that learns which features to request on a per-instance basis, minimizing acquisition costs while maintaining prediction accuracy.

  • Sequential feature acquisition based on instance-specific needs
  • Balances cost of acquiring data with expected prediction improvement
  • Nonparametric oracle-based approach for feature selection decisions
  • Outperforms reinforcement learning and surrogate model alternatives

The active feature acquisition problem

In many real-world scenarios, not all features are free to acquire.

Imagine a healthcare application where diagnosing a patient requires multiple tests—each test costs money, takes time, and carries some risk. The ideal system wouldn't demand every test upfront; instead, it would intelligently ask for just the tests needed to reach a confident diagnosis. Active feature acquisition solves this by deciding, for each individual case, which features to request next, weighing whether the expected improvement in prediction quality justifies the cost of acquisition.

Key challenges

Previous approaches struggled with fundamental tradeoffs in machine learning.

Sparse Rewards Complex Action Spaces Multidimensional Distributions Joint Feature Dependencies

Reinforcement learning methods faced difficulties training policies due to sparse reward signals and complex decision spaces. Surrogate models struggled to capture the intricate, multidimensional conditional distributions needed. Greedy approaches failed to account for how acquiring multiple features together could provide richer information than acquiring them sequentially.

The Acquisition Conditioned Oracle (ACO) approach

A novel nonparametric method that sidesteps the limitations of prior approaches.

Nonparametric Methods Oracle-Based Learning Acquisition Conditioning Joint Feature Selection

The ACO framework uses an oracle-based approach conditioned on the set of acquired features. Rather than modeling complex conditional distributions or training RL policies, it queries an oracle given the current feature set—making acquisition decisions that naturally account for feature dependencies and joint informativeness. This elegant design avoids many of the training difficulties that plagued alternative methods.

Results and applications

The ACO method demonstrates superior performance across diverse acquisition scenarios.

Extensive experiments show the ACO outperforms state-of-the-art active feature acquisition methods for both prediction tasks and general decision-making scenarios. The framework proves particularly valuable in domains like healthcare, where acquisition costs are high and feature dependencies matter—systems that adaptively request only necessary tests save time, money, and reduce patient burden.