
A key design consideration within the approach was the single- versus multi-label classification paradigm. While multi-label classifiers have proven efficacy, even surpassing human performance (Russakovsky et al. 2015), they are a significantly more complex endeavour. Consequently they demand substantial amounts of data and deep CNN architectures; as such they were deemed prohibitively costly to procure and to train, and were not included within the design.

In place of a multi-label classification approach, the same network was reused $n^2 - n$ times (where $n$ is the number of dimensions in the data frame), with a single image supplied to the CNN per possible directed edge in the graph, and hence the prediction of the BN structure was performed in aggregate. This was considered a legitimate approach as the assessment of the presence of an edge should be performed in the same manner, irrespective of the edge in question. Unfortunately it also treated the presence of edges as independent of one another, which is not strictly true (for example, because a BN must be acyclic, the edges A → B and B → A cannot both be present).

While this involved more extensive scripting, it also had the benefit of constraining the problem to a series of binary classifications, which can reasonably be expected to improve CNN performance on a given volume of data. It also had the benefit of performing a pseudo-augmentation of the training data, with $n^2 - n$ different permutations of the data available per synthetic BN.

In order to further simplify the required classification task, a separate CNN was trained per BN node count (i.e. one network for each node count from 4 to 10). This is illustrated graphically in figure 3.4.1 below (with image creation defined in 3.5.1):

Figure 3.4: A database with a BN of unknown structure is used to create an image per possible directed edge. Collectively these are used to predict the overall structure of the BN via a CNN.
* This step is only performed when a structure prediction is required.
Training data was shuffled such that no sequential pattern that represented the underlying data set was present (beyond the shuffled distributions).
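The per-edge decomposition described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used here: `directed_edges`, `predict_structure`, the `threshold` parameter, and the `toy_classifier` stand-in are all hypothetical names, and the classifier callable simply takes the place of "build the image for this candidate edge and pass it to the trained CNN".

```python
from itertools import permutations


def directed_edges(n):
    """Return every ordered pair (i, j) with i != j over n nodes.

    There are n * (n - 1) = n^2 - n such pairs, one per candidate directed edge.
    """
    return list(permutations(range(n), 2))


def predict_structure(n, edge_classifier, threshold=0.5):
    """Assemble an n x n adjacency matrix from independent binary predictions.

    edge_classifier is a hypothetical callable (i, j) -> probability that the
    edge i -> j is present; it stands in for the image creation and CNN call.
    threshold is an assumed decision cut-off, not a value from the text.
    """
    adjacency = [[0] * n for _ in range(n)]
    for i, j in directed_edges(n):
        if edge_classifier(i, j) >= threshold:
            adjacency[i][j] = 1
    return adjacency


def toy_classifier(i, j):
    """Illustrative stand-in: strongly favours the chain 0 -> 1 -> 2 -> ..."""
    return 0.9 if j == i + 1 else 0.1


if __name__ == "__main__":
    print(len(directed_edges(4)))  # 12 candidate edges for n = 4
    for row in predict_structure(4, toy_classifier):
        print(row)
```

Because each candidate edge is scored in isolation, nothing in this sketch prevents the assembled matrix from containing a cycle; any acyclicity check would have to be applied as a separate step.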