#xgboost4j predict fails on some devices
Can you explain more about how you split the data and such?
I don't use Java, so I can't really read through the code, but understanding your diagram and flow is no problem
For example, there could be data leakage problem, there could be data imbalance problem, and so on
If your model is only good at predicting class A, and your test set on server B is mostly filled with class A, then you will certainly have a higher result there
Ruling out these possibilities is what I'm looking at
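A quick way to rule out the imbalance explanation is to compare the class shares of the two test sets. This is a minimal sketch using only the standard library; `class_shares` and the sample labels are hypothetical and not from the thread.

```python
from collections import Counter


def class_shares(labels):
    """Return each class's share of the labels as a fraction of the total."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


# Hypothetical labels for two test sets; replace with the real ones.
test_set_a = [1, 1, 1, 0]
test_set_b = [1, 0, 0, 0]
print(class_shares(test_set_a))
print(class_shares(test_set_b))
```

If the shares differ a lot between the two sets, a gap in accuracy between environments may come from the data rather than from the model or the library.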
Sorry, maybe I didn't explain it clearly.
I train with xgboost and run inference using xgboost4j.
My issue is: with the same input, the model predicts different results.
Locally, the model always predicts a prob > 0.99, but when I deploy it to server B, it runs correctly and outputs prob = 0.02.
I also deployed it to server A, but it runs into the same problem as local: the output prob is always > 0.99.
At first I thought the issue was an incorrect library being loaded, but I tried another model (model X2, with fewer features) and did the same steps as above.
It works correctly on all devices.
That's strange. I also just found this: https://github.com/dmlc/xgboost/issues/4562 and it's an open ticket. If your stuff works perfectly fine without xgboost4j, then it might be xgboost4j itself
@pallid vault wonder if you have thoughts on this
I think I might use xgboost4j one day. Other than that, I can only offer tech support questions atm.
Is it loading the same weights? Does the server have bias enabled/disabled, if applicable? My first step would be checking to see if the network on the server is actually the same as the local one
It uses the same weights, loaded from a local file.
It just came to mind that something might go wrong when the model is bigger, related to RAM resources.
Is there a debug way to check manually, or are you just confident the server reads the same as local?
But I'm not sure how to check.
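One manual check is to hash the model file on each machine and compare the digests. This is a minimal sketch using only the standard library; `file_sha256` and the path in the usage note are hypothetical names. It only shows whether the files are byte-identical, not whether xgboost4j parses them the same way.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read until f.read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Running `print(file_sha256("model.bin"))` locally, on server A, and on server B should print the same digest everywhere if the deployed file really is the same one.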
I tried another model on the same server, same service, everything the same, just another bin 🙂
And amazingly it works. The model that doesn't work does work on server B; it only fails locally and on server A.
Could you try a different model that successfully runs locally? See if it is only this project or if it is related to them all
Yeah, model X2 works everywhere. I tested in the same project.
Model X1 fails locally and on server A, and works on server B.
P.S. Model X1 is actually an upgrade of X2 with some new features, btw.
@pallid vault you're an angel
I'm trying to read more about xgboost to see if I can find smth
Thank you so much. This took me 2 days of searching around, and I couldn't find any clue.
The only thing that stands out to me atm is the different forms of saving the bin, to include feature maps or not
The Booster class also seems to contain a Booster#dump_model(), which according to the docs:
"Dump model into a text or JSON file. Unlike save_model(), the output format is primarily used for visualization or interpretation, hence it’s more human readable but cannot be loaded back to XGBoost."
I am not 100% certain it exists in the java version, but it looks like it could definitely show if the model is the same internally or if there are some minor differences
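If a text dump can be produced on each machine, comparing the dumps line by line would show whether the trees differ after loading. This is a minimal sketch using only the standard library. It assumes the dumps already exist as strings, for example read from files written by `dump_model()` in Python; whether and how xgboost4j exposes an equivalent dump is an assumption to verify. `diff_dumps` and the sample tree lines are hypothetical.

```python
import difflib


def diff_dumps(dump_a, dump_b, name_a="local", name_b="server_b"):
    """Return unified-diff lines between two text dumps (empty list if identical)."""
    return list(
        difflib.unified_diff(
            dump_a.splitlines(),
            dump_b.splitlines(),
            fromfile=name_a,
            tofile=name_b,
            lineterm="",
        )
    )


# Hypothetical one-line dumps that differ only in the split threshold.
local_dump = "0:[f0<0.5] yes=1,no=2"
server_dump = "0:[f0<0.6] yes=1,no=2"
for line in diff_dumps(local_dump, server_dump):
    print(line)
```

An empty result means the two dumps are textually identical, which would point away from the model contents and toward the runtime environment.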
The problem in my case is that it works with the previous version, and it works on server B. But what I want is for it to work on server A 😄
My stupid solution for now is to make an API so server A calls predict on server B. It's so stupid 😦 poor me
I train in Python
XGB Version 1.6.1
model.save_model(f'{OUTPUT_DIR}/prod26_{PRJ_NAME}_model.bin')
That saves it to a bin, and then I use xgboost4j to load it back for real-time inference.
I have to go soon, so hopefully something here has helped a bit
Thank you for your help. You are so nice.
I am the default java helper here lol, Ian calls me occasionally