With the proliferation of application specific accelerators, the use of heterogeneous clusters is rapidly increasing. Consisting of processors with different architectures, a heterogeneous cluster aims at providing different performance and cost tradeoffs for different types of workloads. In order to achieve peak performance, software running on heterogeneous cluster needs to be designed carefully to provide enough flexibility to explore its variety. We propose a design methodology to modularize complex software applications with data dependencies. The software application designed in this way have the flexibility to be reconfigured for different hardware platforms to facilitate resource management, and features high scalability and parallelism. Using a neuromorphic application as a case study, we present the concept of modularization and discuss the management, scheduling and communication of the modules. We also present experimental results demonstrating the improvements and effects of system scaling on throughput.