Product Quantization: Performance vs Memory Trade-offs
Analysis of 1536-dimensional vectors with overquery factor = 5. Point size represents recall quality.
Key Trade-offs
16 subspaces: 384× compression, but 60% recall loss
64 subspaces: 96× compression, 10% recall loss
192 subspaces: 32× compression, full recall maintained
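The compression ratios above can be reproduced with simple arithmetic. The sketch below assumes the original vectors are stored as 32-bit floats and that each subspace is encoded as a single byte (256 centroids per subspace); neither assumption is stated in the figures, but together they match the numbers listed. Codebook storage is ignored, and the name compression_ratio is illustrative only.

```python
# Assumptions (not stated in the source): float32 inputs, one byte per subspace code.
DIMENSIONS = 1536
BYTES_PER_FLOAT32 = 4
BYTES_PER_CODE = 1  # 8-bit code, i.e. 256 centroids per subspace


def compression_ratio(subspaces, dimensions=DIMENSIONS):
    """Ratio of raw vector size to PQ code size, ignoring codebook overhead."""
    raw_bytes = dimensions * BYTES_PER_FLOAT32      # 1536 * 4 = 6144 bytes
    compressed_bytes = subspaces * BYTES_PER_CODE   # one code per subspace
    return raw_bytes / compressed_bytes


ratios = {m: compression_ratio(m) for m in (16, 64, 192)}
for m, ratio in ratios.items():
    print(f"{m:>3} subspaces: {ratio:.0f}x compression")
```

Under these assumptions, halving the number of subspaces doubles the compression ratio, which is why the most aggressive setting looks attractive on memory alone.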
Performance Reality
• Aggressive compression (16 subspaces) doesn't improve query time
• An overquery factor of 5 is needed to compensate for the quality loss
• Best compression (16 subspaces, 384×) comes with the heaviest recall degradation (listed above as 60% recall loss)
• Usable configurations (192 subspaces) still provide 32× compression
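To illustrate what an overquery factor does, here is a minimal sketch assuming the common scheme of fetching k × factor candidates using compressed (lossy) distances and then re-ranking them with the exact vectors. The lossy step is a coarse rounding stand-in, not a real product quantizer, and all names (coarse, search, recall) are hypothetical.

```python
import random


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def coarse(vector, step=1.0):
    # Stand-in for lossy compression: snap each value to a coarse grid.
    # A real PQ index would decode centroid codes instead.
    return [round(x / step) * step for x in vector]


def search(query, vectors, lossy, k, overquery_factor=1):
    """Pick k * overquery_factor candidates by lossy distance, re-rank exactly."""
    n_candidates = min(len(vectors), k * overquery_factor)
    by_approx = sorted(range(len(vectors)),
                       key=lambda i: squared_l2(query, lossy[i]))
    candidates = by_approx[:n_candidates]
    reranked = sorted(candidates, key=lambda i: squared_l2(query, vectors[i]))
    return reranked[:k]


def recall(found, truth):
    return len(set(found) & set(truth)) / len(truth)


rng = random.Random(0)
dim, n, k = 16, 500, 10
vectors = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
lossy = [coarse(v) for v in vectors]
query = [rng.gauss(0, 1) for _ in range(dim)]

# Ground truth: exact nearest neighbours over the uncompressed vectors.
truth = sorted(range(n), key=lambda i: squared_l2(query, vectors[i]))[:k]

recall_plain = recall(search(query, vectors, lossy, k, 1), truth)
recall_overquery = recall(search(query, vectors, lossy, k, 5), truth)
print(f"recall without overquery: {recall_plain:.2f}")
print(f"recall with overquery 5:  {recall_overquery:.2f}")
```

Because the larger candidate set always contains the smaller one, re-ranking with overquery cannot lower recall in this scheme; the cost is the extra candidates that must be fetched and compared exactly, which is one reason heavier compression does not automatically translate into faster queries.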
Insight: The "sweet spot" for Product Quantization isn't maximum compression,
but the balance between memory savings and acceptable recall. For production systems,
64 to 192 subspaces often provide the best trade-off between compression ratio and search quality.