Accurate prediction of stock return sequences can improve the performance of portfolio optimization models, and existing results indicate that machine learning methods handle problems with nonlinear, nonstationary characteristics better than econometric models. Accordingly, a novel two-stage method is proposed for constructing well-diversified portfolios based on stock return prediction using machine learning. In the first stage, returns are predicted with machine learning methods, namely eXtreme Gradient Boosting (XGBoost), support vector regression (SVR), and K-Nearest Neighbor (KNN); the models are evaluated, one is selected, and diversified stocks with high predicted returns are chosen. In the second stage, the predictive results are incorporated into the mean semi-variance (M-SV) model, the mean-variance model, and the equally weighted model, subject to constraints such as transaction costs and threshold constraints, to determine the optimal portfolio. Finally, using the component stocks of the China Securities 300 Index (CSI 300) as the study sample, the empirical results demonstrate that the XGBoost+MSV model achieves better results than comparable models and the market index in terms of return and return-risk metrics.
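The two stages can be illustrated with a minimal sketch, which is not the implementation used in this work. It makes several assumptions: the predicted returns are hard-coded stand-ins for the output of a fitted model, the helper names (`select_top_k`, `portfolio_returns`, `semivariance`) and all numbers are hypothetical, semi-variance is measured against the sample mean, and equal weights replace the optimized M-SV weights, so transaction costs and threshold constraints are omitted.

```python
from statistics import fmean


def select_top_k(predicted_returns, k):
    """Stage 1 (sketch): indices of the k stocks with the highest predicted return."""
    ranked = sorted(
        range(len(predicted_returns)),
        key=lambda i: predicted_returns[i],
        reverse=True,
    )
    return sorted(ranked[:k])


def portfolio_returns(history, chosen, weights):
    """Weighted return of the chosen stocks for each historical period."""
    return [sum(w * row[i] for i, w in zip(chosen, weights)) for row in history]


def semivariance(series):
    """Downside semi-variance: mean squared shortfall below the sample mean."""
    mean = fmean(series)
    return fmean([min(r - mean, 0.0) ** 2 for r in series])


# Hypothetical predicted returns for four stocks (stand-in for model output).
predicted = [0.02, -0.01, 0.05, 0.03]

# Hypothetical historical returns: rows are periods, columns are stocks.
history = [
    [0.01, 0.00, 0.03, 0.02],
    [-0.02, 0.01, -0.01, 0.00],
    [0.03, -0.01, 0.04, 0.01],
]

chosen = select_top_k(predicted, 2)

# Stage 2 (sketch): equal weights as a placeholder for the optimized weights.
weights = [1.0 / len(chosen)] * len(chosen)
risk = semivariance(portfolio_returns(history, chosen, weights))
```

In a fuller version, the equal weights would be replaced by the solution of a constrained optimization that minimizes `semivariance` for a target return.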