Daily Coding Problem #3: Binary tree (de)serialization
Goal: serialize and deserialize a binary tree and node
By my quick search and understanding, a "binary tree" is one node with two attached to it
Serializing one is to convert it into a format that can be stored without changing the tree's structure.
Deserializing is to restore the stored version to its live data structure, with matching structure as pre-serialization.
Constraints:
The node class is as given:
class Node:
def __init__(self, val, left=None, right=None):
self.val = val
self.left = left
self.right = right
The following should work out:
node = Node('root', Node('left', Node('left.left')), Node('right'))
assert deserialize(serialize(node)).left.left.val == 'left.left'
So to make a little diagram of the above constraint test tree
o
/ \
o o
/
o
So the premise is that we are passing the node to serialize() to get Modified_form_X,
and then passing the Modified_form_X to deserialize() to get the original node,
making the following semantically viable:
class NodeWrapper:
def __init__(self, myNode, extra):
self.val = myNode.val
self.left = myNode.left
self.right = myNode.right
self.extra = extra
def serialize(n):
Modified_node = NodeWrapper(n, 'xyz')
return Modified_node
def deserialize(mn):
Original_node = Node(mn.val, mn.left, mn.right)
return Original_node
But we might as well turn it into an array of bytes as if to prepare it for storage
The word 'serialize' makes me think of adding extra data, so that we can recreate the sequence even if it is completely disorganized (like the volume number on the spine of a manga mook: we can reorder them properly by this even if we arrange them randomly). So that's what I'll do. I'll even randomize it :v
def serialize(n):
# turn node tree into an array for easier handling
ary = []
returnForRight = []
cur = n # we know n is the root
while cur != 'Done':
while cur != None: # log lefts first. then return for right later
ary.append(nodeByte(cur)) # log the node as a byte
if cur.right != None:
returnForRight.append(cur.right)
cur = cur.left # can be None or another node
# cur now is None, so try to get the next right
if len(returnForRight) > 0:
cur = returnForRight[0]
returnForRight.pop(0) # remove the first element, instead of keeping track of progress with an index of returnForRight
else:
cur = 'Done' # finally hit all of the branches
# randomize to prove a point that nobody asked for
out = []
nodesLen = len(ary)
left = nodesLen
for inc in range(nodesLen):
idx = ran(0,left-1)
out.append(ary[idx])
ary.pop(idx)
left -= 1
# now we have random serialized nodes from the tree
# so return them
return out
def serialize(n):
# turn node tree into an array for easier handling
ary = []
returnForRight = []
cur = n # we know n is the root
while cur != 'Done':
while cur != None: # log lefts first. then return for right later
ary.append(nodeByte(cur)) # log the node as a byte
if cur.right != None:
returnForRight.append(cur.right)
cur = cur.left # can be None or another node
# cur now is None, so try to get the next right
if len(returnForRight) > 0:
cur = returnForRight[0]
returnForRight.pop(0) # remove the first element, instead of keeping track of progress with an index of returnForRight
else:
cur = 'Done' # finally hit all of the branches
# randomize to prove a point that nobody asked for
out = []
nodesLen = len(ary)
left = nodesLen
for inc in range(nodesLen):
idx = ran(0,left-1)
out.append(ary[idx])
ary.pop(idx)
left -= 1
# now we have random serialized nodes from the tree
# so return them
return out
def deserialize(x): # x is the byte array from serialize()
# first find the root
# then find nodes with trailSize of only 1 (level 1 children)
# then trailSize of only 2 (level 2 children)
# theoretically: etc
maxLv = getMaxNodeLevel(x) + 1 # ending idx includes value of max lv
lvls = []
for lv in range(maxLv):
lvls.append(findAllNodesOfLevel(x, lv))
# now we have [ [root], [root.left, root.right], [root.left.left] ] though not necessarily in that order
#now let's transform them into Node's
root = Node('root')
nodeLvls = [[root]] # like lvls but Node's instead of bytes
# now let's go through each level starting with lv 1 and attach them to the proper parent
for lv in range(maxLv)[1:]:
nodeLvls.append([])
for idx in lvls[lv]: # note: for each level, each node's parent would have already been defined.
# find this node's parent within the upper level & within the root Node
myParent = matchTrail(nodeLvls[lv-1], idx)
if myParent == None:
print('unable to match a parent to this byte: ' + str(idx))
else:
if localRelativity(idx) == True: # true=right? or false=left
myParent.right = Node(stringifyTrail(idx)) # with left/right as None, its children will claim it l8r
nodeLvls[lv].append(myParent.right) # log the Node for its kids to find
else:
myParent.left = Node(stringifyTrail(idx))
nodeLvls[lv].append(myParent.left)
# now we have the root Node, which has had children nodes attached to it
return root # return the root Node
# [min,max]
def ran(min, max):
return min + (round(random()*(max-min)))
nodeByte format description:
0000 0000
^^^ trail size [0,7] inclusive (effective values [0,4], as there are only 4 bits to descsribe the trail)
^ is root? (set, but unused)
|--| the trail; left-to-right, 0 is 'left' and 1 is 'right'
Done
Well, that took around 3 hours lol
The final working code in Python (as it was the language the problem used):
from random import seed # seed for & generate random numbers
from random import random
class Node:
def __init__(self, val, left=None, right=None):
self.val = val
self.left = left
self.right = right
def nodeByte(n): #we're gonna wrap the node data into a single byte (we know there won't be many levels of the tree)
x = 0
if n.val == "root":
return 128 # root is just 128
trail = n.val.split('.') # 'left.left' => [left,left]
trailSize = 0
for lr in trail:
if lr == 'right':
x += (8 >> trailSize)
trailSize += 1
if trailSize == 4: # only room for 4 trail bits, the problem's example has a max trail of only 2 anyway
break
x += (trailSize & 7) << 4
return x
def findAllNodesOfLevel(nodes, Lv): # for ease in deserialize(). get all nodes with trailSize=Lv
a = []
for byte in nodes:
if ((byte & 112)>>4) == Lv:
a.append(byte)
return a
def localRelativity(nodeByte): # am i left or right? (in relation to my direct parent node)
return ((nodeByte & 8) >> (4 - ((nodeByte & 112)>>4))) == 1 # True = right, False = left
def getMaxNodeLevel(nodes):
max = 0
for byte in nodes:
lv = ((byte & 112)>>4)
if lv > max:
max = lv
return max
def matchTrail(nodes, byte):
myTrail = stringifyTrail(byte)
trailSize = (byte & 112) >> 4
if trailSize == 1:
myTrail = 'root' # is a direct child of the root node
else:
if localRelativity(byte): # check if left/right. We turn myTrail: left.left into left, to match parent lineage
myTrail = myTrail[:-6] # - str '.right'
else:
myTrail = myTrail[:-5] # - str '.left'
for n in nodes:
if n.val == myTrail:
return n
return None
def stringifyTrail(byte):
o = ''
trailSize = (byte & 112) >> 4
if trailSize == 0:
return o
if (byte & 8) == 8:
o += 'right'
else:
o += 'left'
if trailSize > 1:
curTrailIdx = 1
while curTrailIdx < trailSize:
if (byte & (8 >> curTrailIdx)) > 0:
o += '.right'
else:
o += '.left'
curTrailIdx += 1
# done. turned 0101 into 'left.right.left.right', etc
return o
def serialize(n):
# turn node tree into an array for easier handling
ary = []
returnForRight = []
cur = n # we know n is the root
while cur != 'Done':
while cur != None: # log lefts first. then return for right later
ary.append(nodeByte(cur)) # log the node as a byte
if cur.right != None:
returnForRight.append(cur.right)
cur = cur.left # can be None or another node
# cur now is None, so try to get the next right
if len(returnForRight) > 0:
cur = returnForRight[0]
returnForRight.pop(0) # remove the first element, instead of keeping track of progress with an index of returnForRight
else:
cur = 'Done' # finally hit all of the branches
# randomize to prove a point that nobody asked for
out = []
nodesLen = len(ary)
left = nodesLen
for inc in range(nodesLen):
idx = ran(0,left-1)
out.append(ary[idx])
ary.pop(idx)
left -= 1
# now we have random serialized nodes from the tree
# so return them
return out
def deserialize(x): # x is the byte array from serialize()
# first find the root
# then find nodes with trailSize of only 1 (level 1 children)
# then trailSize of only 2 (level 2 children)
# theoretically: etc
maxLv = getMaxNodeLevel(x) + 1 # ending idx includes value of max lv
lvls = []
for lv in range(maxLv):
lvls.append(findAllNodesOfLevel(x, lv))
# now we have [ [root], [root.left, root.right], [root.left.left] ] though not necessarily in that order
#now let's transform them into Node's
root = Node('root')
nodeLvls = [[root]] # like lvls but Node's instead of bytes
# now let's go through each level starting with lv 1 and attach them to the proper parent
for lv in range(maxLv)[1:]:
nodeLvls.append([])
for idx in lvls[lv]: # note: for each level, each node's parent would have already been defined.
# find this node's parent within the upper level & within the root Node
myParent = matchTrail(nodeLvls[lv-1], idx)
if myParent == None:
print('unable to match a parent to this byte: ' + str(idx))
else:
if localRelativity(idx) == True: # true=right? or false=left
myParent.right = Node(stringifyTrail(idx)) # with left/right as None, its children will claim it l8r
nodeLvls[lv].append(myParent.right) # log the Node for its kids to find
else:
myParent.left = Node(stringifyTrail(idx))
nodeLvls[lv].append(myParent.left)
# now we have the root Node, which has had children nodes attached to it
return root # return the root Node
# [min,max]
def ran(min, max):
return min + (round(random()*(max-min)))
node = Node('root', Node('left', Node('left.left')), Node('right'))
assert deserialize(serialize(node)).left.left.val == 'left.left'
print('Success! U haz smartz!')
No comments:
Post a Comment