-
Notifications
You must be signed in to change notification settings - Fork 11
Fast sine approximation
Based on the sse_mathfun_sincos function.
The cephes library function that this is based on calculates sine or cosine using a Taylor polynomial with 4 terms (7th or 8th order poly because alternate coefficients are 0 and therefore ignored). This has reasonably accuracy to the limits at -π, +π. But it has remarkable accuracy between -π/4, π/4. The downside of using this limited range is 3 sets of flipping to extend a pair of calculations across all 8 of the 45° segments that can be mapped.
If we don't need that much accuracy we could use a simpler scheme where we just calculate the entire range from -π to +π with a single poly (or both if we want to do cosines at the same time) A 6 term poly (11th or 12th order) has a maximum error of less that 4.66e-4 at the limit.
Furthermore rather than calculated sin(x) where x is in radians. We can calculate sin(2πx) where x is in turns. We can recalculate the coefficients of the polynomials to factor in 2π for x. This has the advantage that we can modulate the input phase to 0 ≤ x < 1. This keeps the phase in line with the non-trigonometric waveforms (square, saw, tri) working directly from a modf modulator. We avoid the need to multiply by 2π, and swapping out the polynomial coefficient constants one for another has no net cost.
After modulation, we subtract 0.5f from the input value x. This shifts the range of the calculation to -0.5 ≤ x < 0.5 turns, (π ≤ x < π radians), and compensate for that by flipping the result of the calculation, or better yet flipping the sign of the coefficients.
At a limit of π/4, the error for 4 terms is no more that the magnitude of the 5th term evaluated at the limit. For the sine calculation that upper error bound is 3.13362E-07.
If we want to acheive the same low error level at -π ≤ x < π as is acheived for -π/4 ≤ x < π/4, then we need to include terms upto the 15th or 16th order coefficient. I.e 8 terms. The magnitude of the 9th term at π is 7.95205E-07
Given that the SSE algorithm uses 4 terms and calculates both the sine and cosine polynomials to 4 terms. It's not clear that it is more efficient than calculating just sine to 8 terms.
The first 25 coefficients (2ⁿπⁿ/n!) are:
Modified Coefficients
6.28318530717959e0
1.97392088021787e1
4.13417022403998e1
6.49393940226683e1
8.16052492760750e1
8.54568172066937e1
7.67058597530614e1
6.02446413718766e1
4.20586939448976e1
2.64262567833744e1
1.50946425768230e1
7.90353637131847e0
3.81995258484828e0
1.71439071108867e0
7.18122301778500e-1
2.82005968455791e-1
1.04229162208140e-1
3.63828411425457e-2
1.20315859421206e-2
3.77983420068004e-3
1.13092374825180e-3
3.22991067207098e-4
8.82353359924300e-5
2.30999569450704e-5
5.80565240294990e-6
And adding in a scaling term to bring the result to a -5v/+5v range the first 25 coefficients are
5V Coefficients
3.14159265358979e1
9.86960440108936e1
2.06708511201999e2
3.24696970113341e2
4.08026246380375e2
4.27284086033469e2
3.83529298765307e2
3.01223206859383e2
2.10293469724488e2
1.32131283916872e2
7.54732128841149e1
3.95176818565923e1
1.90997629242414e1
8.57195355544336e0
3.59061150889250e0
1.41002984227896e0
5.21145811040699e-1
1.81914205712728e-1
6.01579297106031e-2
1.88991710034002e-2
5.65461874125898e-3
1.61495533603549e-3
4.41176679962150e-4
1.15499784725352e-4
2.90282620147495e-5