Optimization algorithms like Gradient Descent serve as the engine for Machine Learning, iteratively adjusting model weights to minimize prediction error. While mathematical theory provides rigorous "upper bounds" on how quickly these algorithms should converge, implementation on real-world datasets often encounter numerical hurdles that theory ignores. We investigate this divergence by comparing the empirical performance of a Logistic Regression model trained using a Patient Survival dataset, against its formal mathematical proofs. We focus on “learning-rate” as the primary variable influencing stability and efficiency. Two distinct factors are monitored: convergence of the loss function, and geometric movement of weights through the search space. The overlaying of theoretical convergence curves onto the observed data can identify algorithmic behavior drift from predicted outcomes. Our empirical study results indicate that as the learning rate approaches a critical threshold, the model experiences oscillations that violate the smooth convergence guaranteed by most convex optimization proofs. We present a rigorous comparison of how mathematical ideals hold up under varying hyperparameters, offering a framework for selecting settings that balance computational efficiency with mathematical reliability – a critical factor in domains like healthcare, cybersecurity, and fraud detection. Future work will increase research depth by incorporating additional predictors (categorical and non-categorical) for training and assessing a high-dimensional model.