Different chemical spaces, appropriate validation, and good datasets to explore

What is currently considered the best approach when a model is trained in one chemical space but is supposed to make predictions on test data from a different chemical space? Also, are there any recommended analyses for determining how similar the chemical spaces of the training and test sets are?
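To make the similarity question concrete, one simple form such an analysis could take is a nearest-neighbour Tanimoto check: for every test molecule, find its highest similarity to any training molecule and look at the distribution. This is only a rough sketch under stated assumptions. It assumes each fingerprint is available as a set of on-bit indices, and the function names, the toy data, and the 0.4 cutoff are all hypothetical, not taken from any particular dataset or library.

```python
from statistics import mean

def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)

def nearest_train_similarity(train_fps, test_fps):
    """For each test fingerprint, the highest similarity to any training one."""
    return [max(tanimoto(t, tr) for tr in train_fps) for t in test_fps]

# Toy fingerprints (hypothetical); real ones would come from the dataset.
train_fps = [{1, 2, 3, 4}, {2, 3, 4, 5}, {10, 11, 12}]
test_fps = [{1, 2, 3, 5}, {20, 21, 22}]

nn_sims = nearest_train_similarity(train_fps, test_fps)

# Hypothetical cutoff: test molecules below it have no close training neighbour.
CUTOFF = 0.4
far_fraction = sum(s < CUTOFF for s in nn_sims) / len(nn_sims)

print("nearest-neighbour similarities:", nn_sims)
print("mean nearest-neighbour similarity:", mean(nn_sims))
print("fraction of test set far from training:", far_fraction)
```

If most test molecules have a low nearest-neighbour similarity, the test set probably lies largely outside the training chemical space. The comparison is quadratic in the number of molecules, so for large sets it would likely need subsampling or a dedicated cheminformatics toolkit.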

Additionally, I saw that AIRCHECK offers datasets for many different target proteins, but we are unable to determine which datasets are of the best quality and which targets are the most important. Are there any recommendations, preferably for datasets that include complete molecular information as opposed to only fingerprints?

Lastly, our results tended to be too optimistic when we evaluated on the training data, because the test data lies in a chemical space that is very different from the training data. Is there a way to estimate prediction accuracy more realistically before doing experimental validation?

Thank you for your time.

New member, Luka Lisica