Diffraction gratings and the diffraction limit

Post by PinkysBrain »

What determines the line width (FWHM) limit of the lines produced by shining a coherent light beam through a diffraction grating? (Assuming you use an off axis diffraction order.)

I don't really see where the Abbe limit is supposed to come from in this situation.

Post by Dinesh »

PinkysBrain wrote:What determines the line width (FWHM) limit of the lines produced by shining a coherent light beam through a diffraction grating? (Assuming you use an off axis diffraction order.)

I don't really see where the Abbe limit is supposed to come from in this situation.
I'm not sure exactly what your question is. Are you asking what the connection between the Abbe limit and a holographic grating is?

If so, then consider a collimated beam impinging at normal incidence on a grating with only one spatial frequency. The grating produces diffracted beams that are angularly modulated. In fact, for normal incidence on a grating with a line spacing of d, the grating equation for the first order gives:

d = lambda/sin(theta)

If you take the +1 and -1 orders into account, the angular spread of the beams will be 2*theta. Clearly, the smaller d is, i.e. the larger the frequency of the grating, the larger the angular spread. By the way, if the input beam is not at normal incidence, as you mention, but at an angle phi, the grating equation becomes:

d = lambda/(sin(theta) +/- sin(phi))
which makes the analysis slightly more complex, but not by much.
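
To put numbers on the grating equation, here is a minimal Python sketch. The 633 nm wavelength and the grating periods are illustrative assumptions, the function name is hypothetical, and the "+" sign convention for oblique incidence is just one of the two choices allowed by the +/- in the equation.

```python
import math

def diffraction_angle_deg(d, wavelength, order=1, incidence_deg=0.0):
    """Diffraction angle theta in degrees from
    d*(sin(theta) + sin(phi)) = order*wavelength (assumed sign convention).
    Returns None when the order does not propagate."""
    s = order * wavelength / d - math.sin(math.radians(incidence_deg))
    if abs(s) > 1.0:
        return None  # |sin(theta)| > 1: no propagating beam for this order
    return math.degrees(math.asin(s))

wavelength = 633e-9  # assumed HeNe line, in metres
for d in (2.0e-6, 1.0e-6):  # hypothetical grating periods, in metres
    theta = diffraction_angle_deg(d, wavelength)
    print(f"d = {d * 1e6:.1f} um: theta = {theta:.2f} deg, "
          f"+1/-1 spread = {2 * theta:.2f} deg")
```

The smaller period gives the larger angle, and a period below the wavelength returns None because no first order propagates at normal incidence.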

If a lens is placed to intercept these angularly modulated beams, then the back focal plane of the lens contains the Fourier Transform - the Transform Plane. Continuing on, the beams then inverse-transform and image onto a screen. Now if the system is changed so that light passes through a transparency, then the wavefunction of the transmitted beam can be decomposed into a series of angularly modulated collimated beams (see, for example, Goodman's "Introduction to Fourier Optics"). Thus the output of the transparency - the transmitted beam - can be thought of as coming from a plane grating with multiple frequencies. However, the angular spread of the decomposed planar beams depends on the detail in the transparency, with the finer detail corresponding to a higher frequency in the grating. So, if the capturing lens is not wide enough, it "misses" the more extreme rays and therefore some detail is lost. The lens, in effect, acts like a low-pass filter.

The FWHM usually quoted for a grating refers to a reflective grating, where the FWHM is the bandwidth of the grating. In a transmissive grating, the profile of the output beam is pretty close to the profile of the input beam. In a reflective grating with a single frequency (Yes, I know it's impossible) the output spectrum would be a Dirac delta function due to the Bragg selectivity. However, real reflective gratings have a central frequency where the majority of the Bragg planes have the same spacing, with some lower frequencies and some higher. The exact profile of the output depends on the statistical variation of the Bragg spacing within the grating and gives rise to a Gaussian function (almost - it's not exactly Gaussian). On the Gaussian model, the sigma of the curve gives the FWHM: FWHM = 2*sqrt(2*ln(2))*sigma, or about 2.355*sigma. As a matter of fact, the profile of the output is the Fourier Transform of the Bragg plane distribution function within the emulsion.
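
For the Gaussian model, the link between sigma and the FWHM can be checked in a couple of lines. This is only a sketch for an ideal Gaussian profile; the function names and the sigma value are illustrative assumptions.

```python
import math

def gaussian_fwhm(sigma):
    """Full width at half maximum of exp(-x^2 / (2*sigma^2))."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma

def gaussian(x, sigma):
    """Unit-height Gaussian profile."""
    return math.exp(-x * x / (2.0 * sigma * sigma))

sigma = 1.0  # arbitrary units, assumed
half_width = gaussian_fwhm(sigma) / 2.0
# The profile should have dropped to half its peak at +/- FWHM/2.
print(gaussian_fwhm(sigma), gaussian(half_width, sigma))
```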

Post by PinkysBrain »

Dinesh wrote:I'm not sure exactly what your question is. Are you asking what the connection between the Abbe limit and a holographic grating is?
Sorry, wasn't clear. I meant the projection from the grating on a surface at some given distance. It will generate lines with the same spacing as the grating, with the width of the lines decreasing as the number of slits covered by the beam increases (multiple slit version of Young's experiment). With the Fraunhofer approximation usually used to calculate the shape of the resulting pattern there is no real limit to how narrow the lines can get with increasing number of slits ... so I was wondering where the diffraction limit for the line width came from in this case.

A little googling though shows that the Fresnel approximation isn't valid for near/sub wavelength features, so neither is the Fraunhofer approximation nor Fourier optics.

Post by Dinesh »

Ok, I see what you mean.

Well, let's start with a line of coherent oscillators. Let's say a row of electrons since you can't go much smaller than that (well, you can have a row of quarks, but this ain't a paper on QCD - though a row of oscillating quarks is an interesting idea!) Sorry, I digress. If you assume that there are N electrons arranged in a row at a separation of d from each other and the distance from the centre of this row to the screen is R, then the E field at some distant point is:

E = E(0)*exp(i*(k*R - w*t))*(sin(N*delta/2)/sin(delta/2))
where:
E(0) = amplitude of laser beam
w = angular frequency of laser
k = 2*pi/lambda (wavenumber, the magnitude of the propagation vector of the laser)
delta = k*d*sin(theta)
theta = direction of screen from centre of row of electrons.

The intensity is then proportional to |E|^2 and is:

I = I(0)*(sin^2(N*delta/2)/sin^2(delta/2))

The numerator oscillates fairly rapidly with respect to the denominator because of the factor of N in the sin^2 function. Thus, this is a series of sharp, bright lines surrounded by lines of lower intensity. These sharp lines occur whenever

delta = 2*n*pi

Since delta = k*d*sin(theta), you derive the grating equation d*sin(theta) = n*lambda. Also,

sin^2(N*delta/2)/sin^2(delta/2) -> N^2 as delta -> 2*n*pi
(by L'Hopital's rule)

So the principal maxima, the sharp bright lines, have an intensity of N^2*I(0). Now notice that if the electrons were closer together than the lambda of the laser (d < lambda), then the condition for maxima has only one value, theta = 0, and you'd only get a single line of light normal to the electrons whose intensity would be N^2*I(0). The width of the line would probably fall off as a sin^2 function. We could go into the difference between a Gaussian and a sin^2 by examining {|exp(-x^2/a) - (exp(i*phi) - exp(-i*phi))|}, but let me just say from an intuitive point of view that it'll probably be close to Gaussian. That's a single row of electrons spaced at intervals of d (which may or may not be smaller than lambda). If we now allow the number of electrons to become infinite, with each electron oscillating only by an infinitesimal amount, then this becomes a continuous line of oscillators and you'd integrate over the line to get the final field strength.
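
The N^2 peak height and the position of the first zero can be checked numerically. The following is a minimal sketch of the interference factor for N ideal oscillators; the function name and N = 10 are illustrative assumptions.

```python
import math

def array_factor(delta, n_osc):
    """sin^2(N*delta/2) / sin^2(delta/2), using the limiting value N^2
    where the denominator vanishes (delta = 2*n*pi)."""
    den = math.sin(delta / 2.0) ** 2
    if den < 1e-24:
        return float(n_osc) ** 2
    return math.sin(n_osc * delta / 2.0) ** 2 / den

n_osc = 10  # assumed number of oscillators
print(array_factor(0.0, n_osc))                  # principal maximum
print(array_factor(2.0 * math.pi, n_osc))        # next principal maximum
print(array_factor(2.0 * math.pi / n_osc, n_osc))  # first zero beside the peak
```

The first zero sits at delta = 2*pi/N, so the principal maxima narrow in proportion to 1/N while their height grows as N^2.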

A single slit of width a and length l can be analysed by assuming the origin of the diffraction is by rows of an infinite number of electron oscillators across the width of the slit and arranged in columns down the length of the slit. In this case, carrying out the integration, the intensity profile is:

I(theta) = I(0)*sinc^2(b)
where sinc(b) = sin(b)/b
b = (pi*a/lambda)*sin(theta)
a = width of slit.

This is the well known sinc squared function with a very high, broad central peak surrounded by much dimmer peaks (the first is about 4.7% of the central peak). The width of the line is set by the zeroes of the sinc function, which occur whenever b = m*pi (m = +/-1, +/-2, ...). The central peak therefore runs from b = -pi to b = +pi, a full width of 2*pi in b, or sin(theta) = +/-lambda/a.
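
The height of the first side peak of sinc squared can be found with a brute-force scan. This sketch assumes the unnormalised convention sinc(b) = sin(b)/b; the function name and step count are arbitrary choices.

```python
import math

def sinc2(b):
    """Single-slit intensity factor (sin(b)/b)^2, equal to 1 at b = 0."""
    if b == 0.0:
        return 1.0
    return (math.sin(b) / b) ** 2

# Zeros fall at b = m*pi; the first side peak lies between pi and 2*pi.
steps = 100000
first_side_peak = max(
    sinc2(math.pi + math.pi * i / steps) for i in range(steps + 1)
)
print(f"first side peak / central peak = {first_side_peak:.4f}")
```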

If we now go onto multiple slits (your scenario), where you have slits of width a separated by a distance of s, in other words, the centre of one slit is a distance s from the centre of the next slit, you'd have to do two integrations. One integration is across each slit and the other is the phase contribution from N slits. In this case, the intensity distribution is

I(theta) = I(0)*sinc^2(q)*(sin^2(N*p)/sin^2(p)) (running out of letters!)
where q = (k*a/2)* sin(theta)
p = (k*s/2)*sin(theta)
k = 2*pi/lambda

Notice now that the expression for a single slit and the expression for a row of idealised electron oscillators have combined. So now you have a bright broad band, surrounded by smaller dimmer bands (as for a single slit), and superimposed on them is a series of thin, sharp lines (as for a line of idealised electron oscillators). Notice also that if the slit width became very, very small, then the sinc^2 function approximates to 1 and the pattern reduces to that of a single line of idealised electrons. The minima of the function for the slits occur whenever sin^2(N*p)/sin^2(p) = 0, or when

p = +/-pi/N, +/-2*pi/N, +/-3*pi/N etc. (excluding multiples of pi, where the principal maxima sit)
So, the width of the central maxima could be approximated to 2*pi/N. In terms of angle, since p = (pi*s/lambda)*sin(theta), that is a full width of 2*lambda/(N*s) in sin(theta).
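
Here is a minimal sketch of the N-slit pattern that converts the 2*pi/N width in p into a width in sin(theta), using p = (k*s/2)*sin(theta). The wavelength, slit width, spacing and slit count are illustrative assumptions, and the function names are hypothetical.

```python
import math

def multi_slit_intensity(sin_theta, wavelength, slit_width, spacing, n_slits):
    """I/I(0) = sinc^2(q) * sin^2(N*p)/sin^2(p) for N identical slits."""
    k = 2.0 * math.pi / wavelength
    q = 0.5 * k * slit_width * sin_theta
    p = 0.5 * k * spacing * sin_theta
    envelope = 1.0 if q == 0.0 else (math.sin(q) / q) ** 2
    den = math.sin(p) ** 2
    if den < 1e-24:
        grating = float(n_slits) ** 2  # limiting value at a principal maximum
    else:
        grating = math.sin(n_slits * p) ** 2 / den
    return envelope * grating

def principal_max_full_width(wavelength, spacing, n_slits):
    """Zero-to-zero width of a principal maximum in sin(theta):
    p = +/-pi/N corresponds to sin(theta) = +/-lambda/(N*s)."""
    return 2.0 * wavelength / (n_slits * spacing)

wavelength, slit_width, spacing = 633e-9, 0.5e-6, 2e-6  # assumed values, metres
for n_slits in (50, 500, 5000):
    width = principal_max_full_width(wavelength, spacing, n_slits)
    print(f"N = {n_slits}: illuminated width = {n_slits * spacing * 1e3:.1f} mm, "
          f"line width in sin(theta) = {width:.3e}")
```

Written this way, the width is 2*lambda divided by N*s, which is the total illuminated width of the grating, so within this scalar model the line narrows only as fast as the beam footprint grows.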

For the Fresnel approximation, the screen would be very near and the light from the grating would be fairly divergent. In this situation, the diffraction pattern would be given by the Cornu spiral which tracks the phase variations across the slit. If the slit width became vanishingly small, then I suspect that the small oscillations at the peak of the Cornu spiral would even out and you'd get a broad central band.

Post by PinkysBrain »

Dinesh wrote:So, the width of the central maxima could be approximated to 2*pi/N
Your on the fly application of scalar diffraction theory is impressive, but this just brings me back to my original question ... where does the diffraction limit on the focusing power come from? I guess that scalar diffraction theory breaks down even in the far field in this case. Can you do it again with full Maxwell equations? :)

Post by Dinesh »

PinkysBrain wrote:Can you do it again with full Maxwell equations?
Well, in a sense, I was. The problem is that the question, as is, is very difficult because it depends so much on the actual conditions. There is a pretty good rule of thumb for the conditions under which the Fraunhofer approximation is valid. Roughly:

d > a^2/lambda
where d = source-aperture distance or aperture-screen distance (whichever is smaller)
a = aperture diameter

Of course, if d is infinite the Fraunhofer condition will always be satisfied, regardless of lambda or the aperture diameter. If the condition is not satisfied and you have to use Fresnel approximations, perhaps the best way to tackle it is either by the Cornu spiral method or using the Fresnel-Kirchhoff integral. The former is still approximate because the phase variations that make up the Cornu spiral may not be smooth, i.e. the "spiral" may not be a spiral at all. If you actually used the real phase variations, you'd probably end up with a Fresnel-Kirchhoff integral anyway. If you use the Fresnel-Kirchhoff formulation, you need an "aperture function". This means that the aperture function will have to be idealised; so that, for a series of slits, the aperture function could be a series of delta functions, but real slits are not delta functions - they have rough edges. Now, you need to place boundary conditions on the integral, which makes it even more complicated. The perfect situation would be to model a real diffraction screen by convolving the delta functions with some sort of "edge function" and then using this in the Fresnel-Kirchhoff integral. The validity of this would depend on how accurate your "edge function" would be.

If now you replaced the dielectric material of the diffracting screen with a conducting screen, you'd have to take into account the currents generated in the conducting medium. You can account for the currents by approximating an idealised conductor (infinite number of free electrons). If you did this, you'd have to assume that all free charges were on the surface. Then, using Maxwell, you'd find the E field for both perpendicular polarisation and transverse polarisation. This would give you the currents on the surface of the conductor. However, the currents themselves would not be uniform and so the expressions for the currents would involve a Fourier Transform (They don't have to. You can always use an integral representation with any set of basis functions, a la QM). In the end, the currents, which create the new diffracted fields, would be complex and you'd have to do a complex integration. This means specifying a path in i-space and assuming certain infinities (to get the residues).

As you can see, finding the diffraction limit of a real set of diffracting elements, without the assumptions of Fraunhofer, Fresnel or Kirchhoff, is bloody difficult! It can be done, but you'd have to consult journals of Theoretical Physics to get the details. In this sense, you may be right that Fresnel and Fraunhofer break down without exact boundary conditions.

On the other hand, the Fraunhofer approximation works well for most real gratings. If you're diffracting a raw HeNe laser beam (or even a narrow line from a cylindrical lens), then the source aperture is the diameter (or width) of the beam, let's say it's about 1 mm. Lambda is 633 x 10^(-9) m. So for Fraunhofer to be valid, the screen where the image is focused has to be at least a^2/lambda = 1.6 m, or a little over 5 feet, away. Not unreasonable in most situations. The diffraction limit under these conditions would be the width of the diffracted maxima, since the diffraction slit is assumed to be infinitely thin and composed of an infinite number of electrons all perfectly bound. In reality, broadening will occur because a real aperture has a finite thickness and there will also be some unbound electrons - nothing is purely a conductor or purely a dielectric. Thus collision effects of electrons within the aperture, Doppler broadening due to thermal effects (I assume you're not carrying out this experiment at absolute zero! Joking! Joking!) and currents generated by the E-field for both polarisation states will broaden the pattern. In addition, the assumption is that the electron oscillators are pure harmonic oscillators. In reality, inter-atomic forces do not allow precise harmonic oscillation. The electron is assumed to oscillate harmonically because the dip in the Van der Waals potential is close to the harmonic potential near the origin. So, the exponentials used in the calculation are themselves approximations, due to the assumption of harmonic potentials that don't really exist.
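
The Fraunhofer distance for the HeNe example is a one-line calculation. This sketch simply evaluates a^2/lambda for a = 1 mm and lambda = 633 nm; the function name is hypothetical.

```python
def fraunhofer_distance(aperture, wavelength):
    """Rough minimum distance a^2/lambda (metres) for the Fraunhofer
    approximation to hold."""
    return aperture ** 2 / wavelength

d_min = fraunhofer_distance(1e-3, 633e-9)  # 1 mm beam, HeNe wavelength
d_min_feet = d_min / 0.3048                # metres to feet
print(f"a^2/lambda = {d_min:.2f} m ({d_min_feet:.1f} ft)")
```

Because the distance scales with the square of the aperture, a 5 mm beam would already need a screen roughly 40 m away.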

With all this in mind, let me start with the fact that Maxwell's equations can be reduced to the wave equation:

d/dx(dE/dx) = (1/c^2)*d/dt(dE/dt) (These are, of course, partial derivatives, but I have no way of typing them.)
Assuming a separable solution of
E = f(x)g(t)
gives a solution for the wave equation of
E = E(0)*exp(i*(k.r - w*t))

giving rise to the description of the wave as a harmonic oscillator in time and space. The Principle of Superposition allows us to combine several waves from several sources by simply adding up the E-fields. This solution comes directly from Maxwell - albeit with no current sources and in a vacuum.

This is a single wave of infinite extent with a single frequency, which is impossible. So, again, the actual wave is a superposition of several components giving a wave packet. You have to integrate over all the frequency components (or wave vector components) in the wave packet to calculate the real propagation of a real wave. In a real medium, you also have dispersion caused by local (bound) current sources, which needs to be taken into account.

However, proceeding in our happy, happy world of total perfection, the diffracted beam is the superposition of a series of exponentials, each coming from one electron in a thin, perfect dielectric, perfectly harmonic mode:

E = E(0){exp(i(kr_1 - wt)) + exp(i(kr_2 - wt)) + ... + exp(i(kr_N - wt))}
where all the r's are the distance from the electron to the point of observation.

This can be expressed:

E = E(0)*exp(i*(k*r_1 - w*t))*{1 + exp(i*delta) + exp(i*delta)^2 + exp(i*delta)^3 + ... + exp(i*delta)^(N-1)}

This is a geometric series which sums to:
(exp(i*delta*N) - 1)/(exp(i*delta) - 1) = exp(i*(N-1)*delta/2)*sin(N*delta/2)/sin(delta/2)
using exp(i*x) = cos(x) + i*sin(x) and exp(-i*x) = cos(x) - i*sin(x).
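
The geometric series step is easy to verify numerically. This is a small sketch comparing the direct sum of N phasors with the closed form; the function names and test values are illustrative assumptions.

```python
import cmath
import math

def phasor_sum(delta, n_osc):
    """Direct sum 1 + exp(i*delta) + ... + exp(i*delta*(N-1))."""
    return sum(cmath.exp(1j * delta * m) for m in range(n_osc))

def closed_form(delta, n_osc):
    """exp(i*(N-1)*delta/2) * sin(N*delta/2) / sin(delta/2)."""
    phase = cmath.exp(1j * (n_osc - 1) * delta / 2.0)
    return phase * math.sin(n_osc * delta / 2.0) / math.sin(delta / 2.0)

delta, n_osc = 0.3, 8  # arbitrary values, delta not a multiple of 2*pi
print(abs(phasor_sum(delta, n_osc) - closed_form(delta, n_osc)))
```

Taking the squared magnitude of either expression gives the sin^2(N*delta/2)/sin^2(delta/2) intensity factor.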

If you plug this back into the expression for the E field, you'll eventually get the expression for the diffraction pattern of a thin line of electron oscillators. In this way, the diffraction field is directly obtained from Maxwell. You might argue that it's possible to get a real diffracted field from a real grating by starting right back from Maxwell, and this would be true. However, as I mentioned above, this would lead to Fresnel-Kirchhoff integrals and complex-plane integration, all of which can be traced back to Maxwell's solution of exp(i*(k.r - wt)) and the principle of superposition; except that the superposition is a much more complex one than simple addition of harmonic functions.

At any rate, as mentioned previously, this result can then be used to derive the expression for diffraction from a series of parallel slits giving the expression I gave earlier:

I(theta) = I(0)*sinc^2(q)*(sin^2(N*p)/sin^2(p))

In this case, the diffraction limited linewidth is given by the minima of the function for the slits, which occur whenever sin^2(N*p)/sin^2(p) = 0, or when

p = +/-pi/N, +/-2*pi/N, +/-3*pi/N etc.
So, the width of the central maximum, the diffraction limited linewidth, is 2*pi/N in p.

If it helps, I could give you the derivation of the diameter of the Airy disc, which is usually considered to be the diffraction limited spot from a lens. The theory is developed in the same way (superposition of ideal harmonic oscillators from a thin disc of electrons in an aperture, or lens), but now a circular coordinate system is used. In this case, the diffraction pattern involves a Bessel function and is:

I(theta) = I(0)*{2*J(1)(k*a*sin(theta))/(k*a*sin(theta))}^2
where J(1) is the Bessel function of the first kind of order one and a is the radius of the aperture.

This shows a central peak surrounded by a dark ring with subsidiary peaks of lower intensity. The radius of the central peak (the diffraction limit of a lens) is given by the first zero of the Bessel function. The first zero of the Bessel function J(1)(u) away from u = 0 occurs when u = 3.83 (from tables). If the radius of the first disk (the diffraction limit, remember) is r, then

sin(theta) = r/R
where R = distance to screen from lens. And so:
k*a*sin(theta) = k*a*r/R = 3.83

Thus, with k = 2*pi/lambda, the radius of the first zone of the Airy disc is

r = 3.83*lambda*R/(2*pi*a) = 1.22*lambda*R/(2*a)

For a lens, R = focal length and a = radius of lens, so the diffraction limited radius of a lens is:

r = 1.22*f*lambda/D
where D = 2*a = diameter of lens.
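
The 3.83 and 1.22 figures can be reproduced without tables. The sketch below evaluates J(1) from Bessel's integral with a trapezoid rule and finds its first positive zero by bisection; the function names, the bracketing interval and the lens values are assumptions made for illustration.

```python
import math

def bessel_j1(x, steps=400):
    """J1(x) from Bessel's integral (1/pi) * int_0^pi cos(t - x*sin(t)) dt,
    evaluated with the trapezoid rule."""
    h = math.pi / steps
    total = 0.5 * (math.cos(0.0) + math.cos(math.pi - x * math.sin(math.pi)))
    for i in range(1, steps):
        t = i * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi

def first_zero_j1(lo=3.0, hi=4.5, tol=1e-10):
    """Bisection for the first positive zero of J1, assumed bracketed by [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bessel_j1(lo) * bessel_j1(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

u1 = first_zero_j1()

def airy_radius(wavelength, focal_length, diameter):
    """Radius of the first dark ring: (u1/pi) * lambda * f / D."""
    return (u1 / math.pi) * wavelength * focal_length / diameter

print(f"first zero of J1: {u1:.4f}, u1/pi = {u1 / math.pi:.4f}")
# Hypothetical lens: f = 100 mm, D = 25 mm, HeNe light.
print(f"Airy radius: {airy_radius(633e-9, 0.1, 0.025) * 1e6:.2f} um")
```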

As you can see, this is derived in exactly the same way (except that it's circular) and we gloss over the details of a real lens made of a real material with rough edges (and scratches all over it, if my lenses are representative!)

Post by PinkysBrain »

With the Airy disc for refractive lenses though we have a lower bound on the spot size because of physically realizable refraction angles, 2pi/N does not seem to have that ... it can't just be precision, keeping path length to sub wavelength precision over relatively large distances is done all the time in semiconductor manufacturing, or holography recording for that matter.

I wonder if scalar diffraction would give more realistic results if the grating was brick wall low pass filtered at 1/wavelength?

Post by Dinesh »

PinkysBrain wrote:With the Airy disc for refractive lenses though we have a lower bound on the spot size because of physically realizable refraction angles
Well, not quite. The lower bound is not due to the physically realisable refraction angles, because those angles are determined on a ray model and calculated by Snell assuming that lambda = 0 (The ray model assumption). The lower bound is due to diffractive effects, which are a physical phenomenon caused on the edge of the lens by the fact of diffraction which is lambda dependent. On the ray model, there is no diffraction because there is no lambda.

Mathematically, the diffraction pattern of any obstacle is given by the Fresnel-Kirchhoff integral. That is, given an obstacle around which light diffracts due to its finite wavelength, the wavefunction of the light impinging on the obstacle is converted by diffractive effects to a new wavefunction given by the Fresnel-Kirchhoff integral of the "incoming" wavefunction convolved with an aperture function, or "obstacle function". In the far field, the optical wavefunction is the FT of the input wave. For the Fresnel-Kirchhoff formulation, the input wave is assumed to be unity, hence, from an operational viewpoint, the output of the diffracting structure - the aperture - is the "impulse response" of the aperture. In the case of a lens, the "incoming" wavefunction is assumed to be collimated and the "obstacle" is assumed to be a circular opening in an otherwise opaque screen. The collimated light is simply an exponential. However, in order to model the circular opening, you need a so-called "circ function":
circ(sqrt(x^2 + y^2)) = 1 for sqrt(x^2 + y^2) < 1
= 1/2 for sqrt(x^2 + y^2) = 1
= 0 otherwise

The FT of such a function, converted to polar coords, is known as the Fourier-Bessel transformation and is expressed in terms of a Bessel function. The zeroes of the Bessel function give the Airy diameter.
PinkysBrain wrote:2pi/N does not seem to have that .
So, in the case of several slits in an otherwise opaque screen, the basic model is still the same. You have an aperture function (the slits), you have the incoming wavefunction (assumed collimated and hence an exponential) and you have the Fresnel-Kirchhoff integral comprising the aperture function, and the diffraction pattern is the impulse response of this particular aperture. The only difference is that the aperture function for slits is different to that of the lens. The fact that the lens is made of glass (for example) is also trivially handled by absorbing it into the aperture function. It's relatively easy to replace the (empty) slits in a grating with, for example, glass and so replace the slits of a grating with a series of cylindrical lenses. The aperture function changes, but the image of the system is still an FT due to diffractive effects. From a pure ray picture of zero lambda, the image would be a series of Dirac functions; due to diffractive effects, the slits will enlarge into bands whose widths would be a convolution of 2*pi/N and a Bessel function.
PinkysBrain wrote:keeping path length to sub wavelength precision over relatively large distances is done all the time in semiconductor manufacturing, or holography recording for that matter.
I'm not sure what you mean, but I suspect you may be confusing fringe coherence with path length differences. I'm not familiar with semiconductor manufacturing, but in holography the path length difference needs to be within the coherence length (temporal coherence limit) of the laser. This is a function of the bandwidth of the laser output (roughly c divided by the bandwidth); a bandwidth of several GHz gives a coherence length of a couple of inches. For display holographers, the upper limit of coherence length is usually 10's of metres. I've often heard the statement that the paths "need to be within half a wavelength", and this is not true. A laser with a coherence length of half a wavelength would be impossible to use for holography. What does have to be within "half a wavelength" is the range of allowable motion of the fringe system during recording. The "half a wavelength" is actually half a fringe width, not the actual wavelength of the laser. So, shooting a single beam Denisyuk with a HeNe would give a fringe width of 633/(2n) nm. If n = 1.5, the fringe width is 211 nm and the fringes cannot move more than 105 nm during recording. This same laser shooting a transmission hologram with a 30 degree reference gives a fringe width of 1266 nm and, this time, the fringes cannot move by more than 633 nm during recording. It's relatively straightforward to calculate the physical range of motion of an actual optic, since the range of motion must not cause a phase difference of more than half a fringe period. Thus if a mirror moves by a distance delta d relative to a fixed mirror at s from the plate, the range of allowable motion of the mirror is:

k(delta d - s) < pi/2.

and the response frequency of the table and/or its components must be such that

k(delta d - s)/t(exp) < pi/(2*t(exp))
where t(exp) is the exposure time.
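
The fringe widths quoted for the two HeNe examples can be reproduced with a short sketch. It assumes the Denisyuk spacing is lambda/(2n) and that the transmission case has the object beam at normal incidence, so the spacing is lambda/sin(reference angle); the function names are hypothetical.

```python
import math

def denisyuk_fringe_spacing_nm(wavelength_nm, refractive_index):
    """Fringe spacing for counter-propagating beams inside the emulsion."""
    return wavelength_nm / (2.0 * refractive_index)

def transmission_fringe_spacing_nm(wavelength_nm, reference_angle_deg):
    """Fringe spacing for a reference beam at the given angle, with the
    object beam assumed to arrive at normal incidence."""
    return wavelength_nm / math.sin(math.radians(reference_angle_deg))

for label, spacing in (
    ("Denisyuk, n = 1.5", denisyuk_fringe_spacing_nm(633.0, 1.5)),
    ("transmission, 30 deg", transmission_fringe_spacing_nm(633.0, 30.0)),
):
    # Allowable fringe motion during exposure is taken as half a fringe width.
    print(f"{label}: spacing = {spacing:.0f} nm, max motion = {spacing / 2:.1f} nm")
```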
PinkysBrain wrote:I wonder if scalar diffraction would give more realistic results if the grating was brick wall low pass filtered at 1/wavelength?
Well, a brick wall filter is a rect function in frequency space, whose FT is a sinc function, so if the aperture function were a sinc function, then the grating would be a brick wall low pass filter. However, you're still using scalar diffraction and the Fresnel-Kirchhoff integral to calculate the output. The level of accuracy would depend on how accurately the slits conform to a sinc function.

I think one source of confusion here may be the word "accuracy". The physical world is modeled by mathematics according to certain assumptions, both physical and mathematical. Insofar as an actual result of an actual measurement is concerned, the "accuracy" may be considered as the error between the result derived mathematically and the measurement itself. However, the mathematical result is based on a mathematical model with an idealised physical basis. Since no actual physical system will ever conform to the idealisation on which the mathematics is based, there will always be inaccuracies. If the result of computation is C and the experimental result is E, then the trick of it is:

|(C - E) | < epsilon(physical system)

Is there a way to minimise epsilon? Well, there may be a mathematical way, but again based on an idealised physical basis! The uncertainties in a real physical system will swamp whatever idealisations you make! Perhaps, the best you can do is to determine epsilon using some kind of Cauchy criterion and use whatever mathematical model gets you closest to the epsilon.

In the end, the question is: Is there a Platonic ideal which can be modeled mathematically and give a vanishing epsilon? I think quantum mechanics drove the last nail into that coffin!

Post by PinkysBrain »

Dinesh wrote:Well, not quite. The lower bound is not due to the physically realisable refraction angles, because those angles are determined on a ray model and calculated by Snell assuming that lambda = 0 (The ray model assumption). The lower bound is due to diffractive effects, which are a physical phenomenon caused on the edge of the lens by the fact of diffraction which is lambda dependent. On the ray model, there is no diffraction because there is no lambda.
Well at least in the formula f/D puts a limit on the spot size, a limit which we can determine based on the concept of refraction ... physical or not.

A limit which 2pi/N de facto lacks for the line width in a grating diffraction pattern ... which I assume is not remotely close to physically correct, getting deep sub wavelength features can't be that easy.

What limits the line width?