A new approach to modeling neural networks in table-network databases.
[This is a translation of an article I published on www.medium.com in a series of posts about the table-network data model. See links to all posts here.]
The HyperTable Management System (HTMS) is designed for universal use. One subject area that the table-network data model underlying HTMS fits especially well is neural networks¹. A neural network is a directed, weighted graph.
As the basic neural network model, I will use a multilayer perceptron (MLP)² with one hidden layer.
The theory and practice of MLPs are excellently presented in a series of articles by Robert Keim on the basic theory and structure of this well-known neural network topology (see Russian translation). The series also includes the text of the Python program Python Code for MLP Neural Networks.py, which implements the two main stages of preparing a neural network: training proper, which consists of selecting the weights that activate the hidden layer and the output node (a vector nonlinear optimization problem), and validation (checking), which determines the probability that the neural network will produce the correct output value for an arbitrary combination of input values.
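To make the training stage concrete, here is a minimal sketch of weight selection by per-sample gradient descent for a perceptron with one hidden layer and one output node. It is not Robert Keim's program and does not use HTMS: the function names, the learning rate, the number of epochs and the toy samples are all assumptions chosen for illustration.

```python
import math
import random


def logistic(x):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_ih, w_ho):
    """Return (hidden activations, output value) for one input vector."""
    hidden = [logistic(sum(xi * w_ih[i][j] for i, xi in enumerate(x)))
              for j in range(len(w_ho))]
    output = logistic(sum(h * w for h, w in zip(hidden, w_ho)))
    return hidden, output


def train(samples, n_hidden, learning_rate=0.5, epochs=2000, seed=0):
    """Select weights by per-sample gradient descent on the squared error."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # w_ih[i][j]: weight of the link from input node i to hidden node j
    w_ih = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_in)]
    # w_ho[j]: weight of the link from hidden node j to the output node
    w_ho = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    for _ in range(epochs):
        for x, target in samples:
            hidden, output = forward(x, w_ih, w_ho)
            delta_out = (output - target) * output * (1.0 - output)
            for j in range(n_hidden):
                # Uses w_ho[j] before it is updated below.
                delta_hid = delta_out * w_ho[j] * hidden[j] * (1.0 - hidden[j])
                for i in range(n_in):
                    w_ih[i][j] -= learning_rate * x[i] * delta_hid
                w_ho[j] -= learning_rate * hidden[j] * delta_out
    return w_ih, w_ho


def mean_squared_error(samples, w_ih, w_ho):
    """Average squared difference between output and expected value."""
    return sum((forward(x, w_ih, w_ho)[1] - t) ** 2
               for x, t in samples) / len(samples)


# Made-up toy samples: the expected output equals the first input;
# the third input is a constant 1 that plays the role of a bias.
SAMPLES = [([0, 0, 1], 0), ([0, 1, 1], 0), ([1, 0, 1], 1), ([1, 1, 1], 1)]

w_ih, w_ho = train(SAMPLES, n_hidden=3)
print("error after training:", mean_squared_error(SAMPLES, w_ih, w_ho))
```

Calling `train` with `epochs=0` returns the untrained random weights, which gives a baseline for comparing the error before and after training.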
I used the Python Code for MLP Neural Networks.py program as a prototype for software (also in Python) that uses HTMS as its DBMS. The software contains the following main components:
The main program is mlp.py
create_db.py is a module that defines the main classes for storing MLP models in a table-network DBMS. The module uses the middle-level HTMS API. It contains descriptions of two hypertables:
mlp is a database (hypertable) for storing perceptron models with the following tables:
Start – catalog of perceptrons in the database;
Input – storage of input nodes;
Hidden – storage of hidden layer nodes;
Output – storage of output nodes of perceptrons.
train – a database (hypertable) for storing data samples for training and validation, i.e. datasets of input and output values:
Training – table of sample data for training;
Validation – table of data samples for validation
load_train_valid.py – a module with a function that loads samples of input data for training and validating a neural network from Excel tables into DBMS tables (as instances of the Training and Validation classes), and that creates empty tables to store the attribute values of instances of the Start, Input, Hidden and Output classes. The module uses the middle-level HTMS API.
mlp_train_valid.py is a module with a function that reads data from the Training and Validation tables, trains the neural network, validates it, and writes the resulting MLP to the database. The module uses the object-level HTMS API for training and validation, and the middle-level HTMS API to store the perceptron in the database.
mlp_load_exec.py is a module (middle-level HTMS API) with two functions:
mlp_load_RAM – reads a specific MLP model from the database;
mlp_execute – returns the MLP output value for any set of inputs.
mlp_par.py is a module with parameters, the logistic function, and the derivative of the logistic function.
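For readers who want to see the two functions mentioned for mlp_par.py, here is a minimal sketch of a logistic function and its derivative. It does not reproduce the actual module; the names `logistic` and `logistic_deriv` and the overflow-safe branch are my own assumptions.

```python
import math


def logistic(x):
    """Logistic (sigmoid) function 1 / (1 + e^(-x)).

    The two branches avoid overflow in math.exp for large |x|.
    """
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def logistic_deriv(x):
    """Derivative of the logistic function: s(x) * (1 - s(x))."""
    s = logistic(x)
    return s * (1.0 - s)


print(logistic(0.0), logistic_deriv(0.0))
```

The derivative is what training uses to turn an output error into a weight correction, which is why the two functions usually live side by side.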
Below are screenshots from the HTMS hypertable editor, which is part of the software system.
General database structure for storing perceptrons (neural networks). The database has 4 tables (Start, Input, Hidden and Output) and 13 attributes:
The Start table is a catalog of the perceptrons stored in the database. In the example, there are 5 neural networks with 3 input and 3 hidden nodes. The StoI field stores a set of simple row references to the Input table, where each row corresponds to one input node. In addition to the input nodes, each perceptron has a special bias node, BiasI, so each perceptron is assigned 4 rows. The Correctness field contains the results of the perceptron validation:
In this form, the editor shows the detailed contents of an entire field for one of the rows in the table. In particular, here you can see that the StoI link field in the 1st row of the Start table contains 4 links: to the 1st, 2nd, 3rd and 4th rows of the Input table:
Contents of the table that stores the input nodes of all perceptrons. The ItoH field contains a set of weighted (numbered) row references to the Hidden table, where each row corresponds to one node in the hidden layer:
The detailed contents of the entire field for one of the rows in the Input table. In particular, here you can see that the ItoH link field in the 1st row of the table contains 3 weighted links – to the 1st, 2nd and 3rd rows of the Hidden table – with weights -0.03, +0.8 and -0.07 respectively:
Table contents for storing hidden nodes of all perceptrons. The HtoO field contains a weighted (numbered) reference to a row in the Output table, where each row corresponds to one output node:
The detailed contents of the entire field for one of the rows in the Hidden table. In particular, here you can see that the HtoO link field in the 1st row of the table contains one weighted link, to the 2nd row of the Output table, with a weight of -0.60:
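The link fields described in these captions can be pictured in plain Python as follows. This is only an illustration of the structure, not the HTMS API: the dictionaries stand in for tables, and every weight except -0.03, +0.8, -0.07 and -0.60 (the values quoted above) is made up, as is the assumption that the bias row links to the hidden rows in the same way as the other input rows.

```python
# Rows are numbered from 1, as in the editor screenshots.

# Start row 1: simple (unweighted) links to 4 rows of Input
# (3 input nodes plus the bias node).
START = {1: {"StoI": [1, 2, 3, 4]}}

# Input rows: weighted links (target row in Hidden, weight).
INPUT = {
    1: {"ItoH": [(1, -0.03), (2, 0.8), (3, -0.07)]},  # values from the text
    2: {"ItoH": [(1, 0.10), (2, -0.20), (3, 0.30)]},  # made up
    3: {"ItoH": [(1, 0.40), (2, 0.50), (3, -0.60)]},  # made up
    4: {"ItoH": [(1, 0.05), (2, 0.05), (3, 0.05)]},   # made up (bias row)
}

# Hidden rows: one weighted link each (target row in Output, weight).
HIDDEN = {
    1: {"HtoO": [(2, -0.60)]},  # value from the text
    2: {"HtoO": [(2, 0.25)]},   # made up
    3: {"HtoO": [(2, 0.70)]},   # made up
}


def edges_of_perceptron(start_row):
    """List every weighted edge reachable from one row of Start."""
    edges = []
    hidden_seen = []
    for i in START[start_row]["StoI"]:
        for h, weight in INPUT[i]["ItoH"]:
            edges.append((f"Input[{i}]", f"Hidden[{h}]", weight))
            if h not in hidden_seen:
                hidden_seen.append(h)
    for h in hidden_seen:
        for o, weight in HIDDEN[h]["HtoO"]:
            edges.append((f"Hidden[{h}]", f"Output[{o}]", weight))
    return edges


for source, target, weight in edges_of_perceptron(1):
    print(f"{source} -> {target}  weight={weight:+.2f}")
```

Following the links from one Start row recovers the whole directed, weighted graph of that perceptron, which is the property that makes the table-network model a natural fit here.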
As an additional example of another perceptron database (with 3 input nodes and 5 nodes in the hidden layer), the contents of its Start table are shown:
An example of a multilayer perceptron with 3 input nodes and 3 nodes in the hidden layer, stored in an HTMS table-network database. The image was generated by the HTMS editor (using the Graphviz visualization package):
Since the algorithms and data structures of the code fragments for validation in the mlp_train_valid.py module and for using the perceptron in the mlp_load_exec.py module are almost identical, this example shows how HTMS tools can be used at the middle level and object level to solve the same task.
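The reason the two code fragments are so similar is that both run the same forward pass through the network. The sketch below shows that shared computation in plain Python, without any HTMS calls; `forward`, `execute` and `validate` are hypothetical names (not the real mlp_execute), the weights are hand-picked toy values, and the 0.5 threshold for a correct answer is my assumption.

```python
import math


def logistic(x):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, w_ih, w_ho):
    """Forward pass: input nodes -> hidden layer -> single output node."""
    hidden = [logistic(sum(x * w_ih[i][j] for i, x in enumerate(inputs)))
              for j in range(len(w_ho))]
    return logistic(sum(h * w for h, w in zip(hidden, w_ho)))


def execute(inputs, w_ih, w_ho):
    """Using the perceptron: one set of inputs gives one output value."""
    return forward(inputs, w_ih, w_ho)


def validate(samples, w_ih, w_ho):
    """Validation: fraction of samples whose thresholded output is correct."""
    correct = 0
    for inputs, expected in samples:
        predicted = 1 if forward(inputs, w_ih, w_ho) >= 0.5 else 0
        if predicted == expected:
            correct += 1
    return correct / len(samples)


# Toy weights: two input nodes (a value and a constant bias of 1) and
# two hidden nodes. W_IH[i][j] is the weight from input i to hidden j.
W_IH = [[10.0, 0.0],
        [-5.0, 10.0]]
W_HO = [10.0, -5.0]

# Made-up validation samples: the expected output equals the first input.
VALIDATION = [([0, 1], 0), ([1, 1], 1)]

print("output for [1, 1]:", execute([1, 1], W_IH, W_HO))
print("share of correct answers:", validate(VALIDATION, W_IH, W_HO))
```

Validation simply repeats the single-input computation over a stored sample set and counts the matches, so only the source of the inputs and the handling of the result differ between the two uses.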