@notzed Valid questions. Here are some things that went into the decision process:
1.) We wanted an ABI that could stay with us for versions of the Epiphany core with only 16 or 32 registers.
2.) We did not optimize much for deeply nested function calls anyway (there are no push/pop-multiple instructions).
3.) Our initial focus was high performance floating point "inner loops".
4.) Initial optimization was more about performance than code density (we somewhat regret this now).
That being said, there may be some room here for improvement (Embecosm also suggested a better approach).
Andreas