Real-world graphs, such as those arising from biological systems, social networks, and the topology of the observable web, require special treatment when analyzed using parallel and distributed algorithms. The irregularity, small-world nature, and overall massive scale common to such networks prove a challenge for algorithm designers and analysts that work with such networks. To ensure the performant running of analytic codes, special care needs to be taken to optimize at all levels of parallelization, including at the processor-level, node-level, and system-level. This work first identifies algorithmic structures common to a large number of parallel graph analytics. Using the noted structures, a set of optimizations are developed that for the various levels of optimizations. Accomplishments of this work include a general approach for analyzing multi-billion vertex graphs on a large-scale system, as well as demonstrated improvements over current state-of-the-art using the implemented optimizations.