Python Learning the Design Principles of Dictionary Data Structures
Conclusion first:
Python’s dict uses a hash table with open addressing (specifically quadratic probing + pseudo-random hopping) under the hood.
Core Structure: A hash table is essentially an “array”.
Each dict corresponds to an underlying array (called table). Each slot in this array may hold a key-value pair.
text
index: 0 1 2 3 4 5 6 7 value: [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ]
Any input key is hashed (hash(key)).
python
d["abc"] = 123
# Python calculates h = hash("abc") → gets an integer, e.g., 918273645
# slot = h % table_size → e.g., 918273645 % 8 = 5
Thus, the key is placed in slot 5.
text
index: 0 1 2 3 4 5 6 7
value: [ ] [ ] [ ] [ ] [ ] [('abc',123)] [ ] [ ]
If the calculated slot is already occupied, the dict does not use chaining (like a linked list). Instead, it finds the next empty slot using Open Addressing.
text
slot 6 empty? → Put here. slot 6 also taken? → Check slot 7. ...
The dict controls the “load factor” (usage ratio). For example, if slot usage exceeds ~2/3, it automatically expands (resizes). After expansion, there is much more space, collisions become rarer, thus dict performance remains O(1).
Resizing operation:
- The
tablesize is doubled. - All keys are rehashed and placed into the new table (rehash).
Considering specific application scenarios: Lookup and insertion operations in a dict have O(1) time complexity on average.
1. Lookup Process
Looking up d[key]:
- Calculate
hash(key). - Locate the slot:
slot = hash % table_size. - Check if the key in that slot is the target key.
- Yes → Found.
- No → Continue probing using the open addressing rule.
Therefore, the efficiency of this process depends on the underlying principle of the hash algorithm. The hash function generally involves bitwise operations + randomization.
2. Practical Code Examples
python
# 1. Create an empty dict
adict = {}
# 2. Insert a key-value pair
# The key is hashed, a slot is found (potentially via probing), and the pair is stored.
adict[key] = value
# 3. Check if a key exists
if i not in adict: # Checks if 'i' is a key in 'adict'
# This operation involves calculating hash(i) and probing the table.
pass
Related articles