Recently, corpus-based text-to-speech (CB-TTS) has been actively studied throughout the world with the aim of bringing synthesized speech closer to human-like naturalness. However, the application of such TTS is severely restricted by its large database (DB) size. To solve this problem, we propose two modifications of the LBG clustering algorithm (split k-means). The first modification introduces a terminating threshold on the total cost: if the reduction in total cost is sufficient to end the iteration process, the number of selected inventories becomes smaller than the target number of clusters. The second modification takes into account the frequency information of unit instances, which is obtained while synthesizing a large text corpus. To incorporate this frequency information, we propose a modified version of the MinMax cost function commonly used in selecting centroids. To evaluate the proposed method, we compared the quality of speech synthesized with the two modified LBG clustering algorithms against that of speech synthesized with the original DB. After reducing the DB to almost the same size, we performed perceptual tests on a set of test sentences. The perceptual test results show that our algorithm greatly reduces the DB size while maintaining good speech quality.
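
As an illustration of the two modifications, the following is a minimal Python sketch and not the implementation evaluated in this paper. It rests on several assumptions: unit instances are represented as plain feature vectors, the distance is squared Euclidean, the terminating rule stops splitting once the total cost is at or below a threshold, and the frequency-aware MinMax cost is taken to be the largest frequency-weighted distance from a candidate to the other members of its cluster. The names `lbg_with_threshold`, `weighted_minmax_centroid`, `cost_threshold`, and `eps` are hypothetical.

```python
# Sketch only: the stopping rule and the frequency weighting are assumptions.

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points]


def total_cost(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(sq_dist(p, centroids[k]) for p, k in zip(points, labels))


def lbg_with_threshold(points, target_clusters, cost_threshold,
                       eps=1e-3, max_iters=50):
    """Split k-means that may stop before reaching target_clusters.

    Assumed stopping rule: no further split is made once the total cost
    is at or below cost_threshold.
    """
    centroids = [mean(points)]
    labels = [0] * len(points)
    while (len(centroids) < target_clusters
           and total_cost(points, centroids, labels) > cost_threshold):
        # Split step: replace centroids by two perturbed copies, without
        # exceeding the target number of clusters.
        split = []
        for i, c in enumerate(centroids):
            untouched = len(centroids) - i - 1
            if len(split) + 2 + untouched <= target_clusters:
                split.append([x + eps for x in c])
                split.append([x - eps for x in c])
            else:
                split.append(list(c))
        centroids = split
        # Refinement step: plain k-means until the assignment is stable.
        labels = assign(points, centroids)
        for _ in range(max_iters):
            for k in range(len(centroids)):
                members = [p for p, l in zip(points, labels) if l == k]
                if members:  # an empty cluster keeps its old centroid
                    centroids[k] = mean(members)
            new_labels = assign(points, centroids)
            if new_labels == labels:
                break
            labels = new_labels
    return centroids, labels


def weighted_minmax_centroid(members, freqs):
    """Index of the member with the smallest worst-case weighted distance.

    Assumed cost of candidate i: max over j of freqs[j] * d(member_i, member_j),
    so frequently used instances pull the representative toward themselves.
    """
    def cost(i):
        return max(freqs[j] * sq_dist(members[i], members[j])
                   for j in range(len(members)))
    return min(range(len(members)), key=cost)
```

Under these assumptions, a looser `cost_threshold` yields fewer clusters than `target_clusters`, and raising the frequency of one instance can move the selected representative toward it. With uniform frequencies the second function reduces to an ordinary MinMax selection.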