Thursday, February 16, 2023

Coding Challenge #4 Binary Tree (De)Serialization

 Daily Coding Problem #3: Binary tree (de)serialization

 
Goal: serialize and deserialize a binary tree and node
By my quick search and understanding, a "binary tree" is one node with two attached to it
Serializing one is to convert it into a format that can be stored without changing the tree's structure.
Deserializing is to restore the stored version to its live data structure, with matching structure as pre-serialization.

Constraints:
The node class is as given:
class Node:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


The following should work out:
node = Node('root', Node('left', Node('left.left')), Node('right'))
assert deserialize(serialize(node)).left.left.val == 'left.left'

So to make a little diagram of the above constraint test tree
    o
   / \
  o   o
 /
o

So the premise is that we are passing the node to serialize() to get Modified_form_X,
and then passing the Modified_form_X to deserialize() to get the original node,
making the following semantically viable:
class NodeWrapper:
    def __init__(self, myNode, extra):
        self.val = myNode.val
        self.left = myNode.left
        self.right = myNode.right
        self.extra = extra
def serialize(n):
    Modified_node = NodeWrapper(n, 'xyz')
    return Modified_node
def deserialize(mn):
    Original_node = Node(mn.val, mn.left, mn.right)
    return Original_node




But we might as well turn it into an array of bytes as if to prepare it for storage
The word 'serialize' makes me think of adding extra data, so that we can recreate the sequence even if it is completely disorganized (like the volume number on the spine of a manga mook: we can reorder them properly by this even if we arrange them randomly). So that's what I'll do. I'll even randomize it :v
def serialize(n):
    # turn node tree into an array for easier handling
    ary = []
    returnForRight = []
    cur = n # we know n is the root
    
    while cur != 'Done':
        while cur != None: # log lefts first. then return for right later
            ary.append(nodeByte(cur)) # log the node as a byte
            if cur.right != None:
                returnForRight.append(cur.right)
            cur = cur.left # can be None or another node
        # cur now is None, so try to get the next right
        if len(returnForRight) > 0:
            cur = returnForRight[0]
            returnForRight.pop(0) # remove the first element, instead of keeping track of progress with an index of returnForRight
        else:
            cur = 'Done' # finally hit all of the branches
    
    
    # randomize to prove a point that nobody asked for
    out = []
    nodesLen = len(ary)
    left = nodesLen
    for inc in range(nodesLen):
        idx = ran(0,left-1)
        out.append(ary[idx])
        ary.pop(idx)
        left -= 1
    
    # now we have random serialized nodes from the tree
    # so return them
    return out
def serialize(n):
    # turn node tree into an array for easier handling
    ary = []
    returnForRight = []
    cur = n # we know n is the root
    
    while cur != 'Done':
        while cur != None: # log lefts first. then return for right later
            ary.append(nodeByte(cur)) # log the node as a byte
            if cur.right != None:
                returnForRight.append(cur.right)
            cur = cur.left # can be None or another node
        # cur now is None, so try to get the next right
        if len(returnForRight) > 0:
            cur = returnForRight[0]
            returnForRight.pop(0) # remove the first element, instead of keeping track of progress with an index of returnForRight
        else:
            cur = 'Done' # finally hit all of the branches
    
    
    # randomize to prove a point that nobody asked for
    out = []
    nodesLen = len(ary)
    left = nodesLen
    for inc in range(nodesLen):
        idx = ran(0,left-1)
        out.append(ary[idx])
        ary.pop(idx)
        left -= 1
    
    # now we have random serialized nodes from the tree
    # so return them
    return out
    
    
def deserialize(x): # x is the byte array from serialize()
    # first find the root
    # then find nodes with trailSize of only 1 (level 1 children)
    # then trailSize of only 2 (level 2 children)
    # theoretically: etc
    
    maxLv = getMaxNodeLevel(x) + 1 # ending idx includes value of max lv
    lvls = []
    for lv in range(maxLv):
        lvls.append(findAllNodesOfLevel(x, lv))
    
    # now we have [ [root], [root.left, root.right], [root.left.left] ] though not necessarily in that order
    #now let's transform them into Node's
    root = Node('root')
    nodeLvls = [[root]] # like lvls but Node's instead of bytes
    
    # now let's go through each level starting with lv 1 and attach them to the proper parent
    for lv in range(maxLv)[1:]:
        nodeLvls.append([])
        for idx in lvls[lv]: # note: for each level, each node's parent would have already been defined.
            # find this node's parent within the upper level & within the root Node
            myParent = matchTrail(nodeLvls[lv-1], idx)
            if myParent == None:
                print('unable to match a parent to this byte: ' + str(idx))
            else:
                if localRelativity(idx) == True: # true=right? or false=left
                    myParent.right = Node(stringifyTrail(idx)) # with left/right as None, its children will claim it l8r
                    nodeLvls[lv].append(myParent.right) # log the Node for its kids to find
                else:
                    myParent.left = Node(stringifyTrail(idx))
                    nodeLvls[lv].append(myParent.left)
    
    # now we have the root Node, which has had children nodes attached to it
    return root # return the root Node

# [min,max]
def ran(min, max):
    return min + (round(random()*(max-min)))


nodeByte format description:
0000 0000
 ^^^ trail size [0,7] inclusive (effective values [0,4], as there are only 4 bits to descsribe the trail)
^ is root? (set, but unused)
     |--| the trail; left-to-right, 0 is 'left' and 1 is 'right'

 

Done


Well, that took around 3 hours lol
The final working code in Python (as it was the language the problem used):

from random import seed # seed for & generate random numbers
from random import random

class Node:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def nodeByte(n): #we're gonna wrap the node data into a single byte (we know there won't be many levels of the tree)
    x = 0
    if n.val == "root":
        return 128 # root is just 128
    
    trail = n.val.split('.') # 'left.left' => [left,left]
    trailSize = 0
    for lr in trail:
        if lr == 'right':
            x += (8 >> trailSize)
        trailSize += 1
        if trailSize == 4: # only room for 4 trail bits, the problem's example has a max trail of only 2 anyway
            break

    x += (trailSize & 7) << 4
    return x
def findAllNodesOfLevel(nodes, Lv): # for ease in deserialize(). get all nodes with trailSize=Lv
    a = []
    for byte in nodes:
        if ((byte & 112)>>4) == Lv:
            a.append(byte)
    return a
def localRelativity(nodeByte): # am i left or right? (in relation to my direct parent node)
    return ((nodeByte & 8) >> (4 - ((nodeByte & 112)>>4))) == 1 # True = right, False = left
def getMaxNodeLevel(nodes):
    max = 0
    for byte in nodes:
        lv = ((byte & 112)>>4)
        if lv > max:
            max = lv
    return max
def matchTrail(nodes, byte):
    myTrail = stringifyTrail(byte)
    trailSize = (byte & 112) >> 4
    if trailSize == 1:
        myTrail = 'root' # is a direct child of the root node
    else:
        if localRelativity(byte): # check if left/right. We turn myTrail: left.left into left, to match parent lineage
            myTrail = myTrail[:-6] # - str '.right'
        else:
            myTrail = myTrail[:-5] # - str '.left'
    
    
    for n in nodes:
        if n.val == myTrail:
            return n
    return None
def stringifyTrail(byte):
    o = ''
    trailSize = (byte & 112) >> 4
    if trailSize == 0:
        return o
    
    if (byte & 8) == 8:
        o += 'right'
    else:
        o += 'left'
    
    if trailSize > 1:
        curTrailIdx = 1
        while curTrailIdx < trailSize:
            if (byte & (8 >> curTrailIdx)) > 0:
                o += '.right'
            else:
                o += '.left'
            curTrailIdx += 1
    
    # done. turned 0101 into 'left.right.left.right', etc
    return o
def serialize(n):
    # turn node tree into an array for easier handling
    ary = []
    returnForRight = []
    cur = n # we know n is the root
    
    while cur != 'Done':
        while cur != None: # log lefts first. then return for right later
            ary.append(nodeByte(cur)) # log the node as a byte
            if cur.right != None:
                returnForRight.append(cur.right)
            cur = cur.left # can be None or another node
        # cur now is None, so try to get the next right
        if len(returnForRight) > 0:
            cur = returnForRight[0]
            returnForRight.pop(0) # remove the first element, instead of keeping track of progress with an index of returnForRight
        else:
            cur = 'Done' # finally hit all of the branches
    
    
    # randomize to prove a point that nobody asked for
    out = []
    nodesLen = len(ary)
    left = nodesLen
    for inc in range(nodesLen):
        idx = ran(0,left-1)
        out.append(ary[idx])
        ary.pop(idx)
        left -= 1
    
    # now we have random serialized nodes from the tree
    # so return them
    return out
    
    
def deserialize(x): # x is the byte array from serialize()
    # first find the root
    # then find nodes with trailSize of only 1 (level 1 children)
    # then trailSize of only 2 (level 2 children)
    # theoretically: etc
    
    maxLv = getMaxNodeLevel(x) + 1 # ending idx includes value of max lv
    lvls = []
    for lv in range(maxLv):
        lvls.append(findAllNodesOfLevel(x, lv))
    
    # now we have [ [root], [root.left, root.right], [root.left.left] ] though not necessarily in that order
    #now let's transform them into Node's
    root = Node('root')
    nodeLvls = [[root]] # like lvls but Node's instead of bytes
    
    # now let's go through each level starting with lv 1 and attach them to the proper parent
    for lv in range(maxLv)[1:]:
        nodeLvls.append([])
        for idx in lvls[lv]: # note: for each level, each node's parent would have already been defined.
            # find this node's parent within the upper level & within the root Node
            myParent = matchTrail(nodeLvls[lv-1], idx)
            if myParent == None:
                print('unable to match a parent to this byte: ' + str(idx))
            else:
                if localRelativity(idx) == True: # true=right? or false=left
                    myParent.right = Node(stringifyTrail(idx)) # with left/right as None, its children will claim it l8r
                    nodeLvls[lv].append(myParent.right) # log the Node for its kids to find
                else:
                    myParent.left = Node(stringifyTrail(idx))
                    nodeLvls[lv].append(myParent.left)
    
    # now we have the root Node, which has had children nodes attached to it
    return root # return the root Node

# [min,max]
def ran(min, max):
    return min + (round(random()*(max-min)))



node = Node('root', Node('left', Node('left.left')), Node('right'))
assert deserialize(serialize(node)).left.left.val == 'left.left'
print('Success! U haz smartz!')

No comments:

Post a Comment

Coding Challenge #54 C++ int to std::string (no stringstream or to_string())

Gets a string from an integer (ejemplo gratis: 123 -> "123") Wanted to come up with my own function for this like 10 years ago ...