Important ML Toolkit Changes

cmccarthy1
New Contributor III
This is relevant to anyone who uses the KX Machine Learning repos (Toolkit/AutoML/NLP). It has been added to this group because the implications of the change may be seen anywhere these repos, or selected functionality from them, are used extensively.

Today we released an update to the ML Toolkit (v3.0.0).

The vast majority of the changes and additions are non-breaking, and a more detailed breakdown of the additional functionality can be found in the release notes for the repo. However, some of the changes may be particularly disruptive to some users. We have attempted to limit this disruption as best we can.

For context, these changes were made with future development in mind and as a move towards syntax that is more intuitive for users coming from other languages (the change to the model calling functionality in particular).

Function naming:
Functions added to the Toolkit in versions < 2.0 used names that were completely lowercase. This had some benefits when dealing with a small number of functions. However, as the number of sub-namespaces within the repo increased, newer code transitioned to lowerCamelCase in line with a number of company-internal standards. This disconnect has now been resolved within the Toolkit and in a number of the NLP functions, with all functions using lowerCamelCase. We have endeavoured to make sure that all top-level functions are still callable using the old lowercase names, and these now print a deprecation warning, as follows, stating that they will be removed following version 3.0:

q).ml.onehot[([]10?`a`b);::]
Future Deprecation Warning: function will no longer be callable after version '3.0'. Please use '.ml.oneHot.fitTransform' instead.
x_a x_b
-------
1  0
1  0
1  0
0  1
1  0

This, however, has only been extended to the top-level documented functions. Users who call internal functions under their old names, for example those within the .ml.fresh.feats namespace, will not receive this warning, and their code will fail.
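To illustrate the general idea of keeping an old name callable while warning about it, here is a minimal sketch in q. It is not the Toolkit's implementation: `deprecate`, `squareAll` and `squareall` are hypothetical names made up for this example, the wrapper only handles single-argument functions, and it is written to be loaded from a script file rather than typed at the prompt.

```q
// Hypothetical helper, not part of the ML Toolkit: returns a one-argument
// function that prints a warning and then delegates to the renamed function
deprecate:{[newName;func]
  {[newName;func;x]
    -1"Future Deprecation Warning: please use '",newName,"' instead.";
    func x
    }[newName;func]
  }

// Toy stand-in for a function renamed from lowercase to lowerCamelCase
squareAll:{x*x}
squareall:deprecate["squareAll";squareAll]

// Calling the old name prints the warning, then returns the same result
squareall 1 2 3
```

A wrapper like this only helps for names that have been explicitly aliased, which matches the behaviour described above: an old name without an alias simply no longer exists.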

Model calling syntax:
The following is also of note. Users of the models within the Toolkit will previously have fit models and used them for prediction as follows:

q)show mdl:.ml.clust.kmeans.fit[d;`e2dist;3;::]
reppts| (3.058613 3.86148;0.9750445 1.324305;4.146106 0.8723815)
clt | 0 1 0 0 2 2 0 2 1 0
data | (3.885652 0.4113437 2.566009 2.473914 4.332783 3.207488 4.541356 4.89..
inputs| `df`k`iter`kpp!(`e2dist;3;10;1b)
// make a prediction
q).ml.clust.kmeans.predict[data;mdl]
0 2 0 1 2

This is no longer the case. The fitted model now carries its own predict (and update) functions, and users are expected to call models as follows:

q)show mdl:.ml.clust.kmeans.fit[d;`e2dist;3;::]
modelInfo| `repPts`clust`data`inputs!((0.216295 0.42723;0.8909372 0.4027502);..
predict | {[config;data]
config:config[`modelInfo];
data:clust.i.floatCo..
update | {[config;data]
modelConfig:config[`modelInfo];
data:clust.i.fl..
q)mdl.predict[data]
0 2 0 1 2

This aligns the function calling syntax with the more Pythonic, scikit-learn style of fitting and applying models.
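As a rough illustration of this pattern, the sketch below builds a toy model dictionary whose `predict` entry is a function with the fitted configuration already bound to it. This is an assumption-laden example and not the Toolkit's k-means: `toyFit` and `toyPredict` are hypothetical names, the "model" just stores the minimum and maximum of 1-D training data as two representative points, and the code is meant to be loaded from a script file.

```q
// Hypothetical toy model, not the Toolkit's k-means:
// label each 1-D point with the index of its nearest representative point
toyPredict:{[config;data]
  repPts:config`modelInfo;
  {[pts;x]first iasc abs pts-x}[repPts]each data
  }

// Fit stores the representative points, then binds the stored
// configuration to toyPredict so that only the data argument is left free
toyFit:{[data]
  config:enlist[`modelInfo]!enlist(min data;max data);
  config,enlist[`predict]!enlist toyPredict[config]
  }

mdl:toyFit 1 2 9 10f

// The model is applied through its own predict entry
mdl[`predict]8 1.4 0.5
```

Here ``mdl[`predict]`` is the same dictionary lookup as the `mdl.predict[data]` call shown above, written with explicit indexing. The design point is that the caller no longer passes the model to a namespace-level predict function, since the fitted state travels with the function stored in the model.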

Anyone who is concerned about the implications of these changes, or who has questions about upgrading to align with the new functionality, should feel free to email ai@kx.com and we will help as best we can with the move to the newer version.