[Dnsmasq-discuss] pxe-service entries in dnsmasq conf seem to fail non-proxy EFI boot

Shrenik Bhura shrenik.bhura at gmail.com
Thu Sep 30 10:47:13 UTC 2021

> 1. seems to have the wrong pcap file, or it does not use the configuration
> attached in the linked archive. It seems to offer the menu items from the
> 2. archive, with custom pxe-services.

Apologies, there was definitely some mistake.

We have applied the patch and tried with and without dhcp-no-override but
it still fails to boot. Herein are the pcap and the logs for this case.

Also included is the qemu pcap, in which the client does boot.

On Wed, 29 Sept 2021 at 20:29, Petr Menšík <pemensik at redhat.com> wrote:

> It is rather hard to work out the results described for each configuration
> (1., 2., 3.). It is unclear to me what the computer printed for each
> variant.
> 1. seems to have the wrong pcap file, or it does not use the configuration
> attached in the linked archive. It seems to offer the menu items from the
> 2. archive, with custom pxe-services.
> Option 43 Suboption: (9) PXE boot menu
>     Length: 41
>     boot menu:
> 8000155058454c494e555820285838362d36345f4546492980010e5058454c494e555820…
>         Type: Unknown (32768)
>         Length: 21
>         Description: PXELINUX (X86-64_EFI)
>         Type: Unknown (32769)
>         Length: 14
>         Description: PXELINUX (EFI)
> The above is not present in the config file presented for it, but it is in
> 2. Are you sure you killed dnsmasq and started it again?
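
For reference, the suboption 9 payload decoded by wireshark above is a sequence of items, each made of a 2-byte type, a 1-byte length, and a description. The Python sketch below rebuilds the payload from the two decoded entries (not from the truncated hex dump) and parses it back. The function names are illustrative only and are not part of dnsmasq or wireshark.

```python
import struct

def encode_boot_menu(items):
    """Encode (type, description) pairs as a PXE boot menu (option 43, suboption 9)."""
    out = b""
    for boot_type, desc in items:
        raw = desc.encode("ascii")
        # 2-byte big-endian boot server type, 1-byte length, then the text
        out += struct.pack("!HB", boot_type, len(raw)) + raw
    return out

def decode_boot_menu(payload):
    """Parse a suboption 9 payload back into (type, description) pairs."""
    items, pos = [], 0
    while pos < len(payload):
        boot_type, length = struct.unpack_from("!HB", payload, pos)
        pos += 3
        items.append((boot_type, payload[pos:pos + length].decode("ascii")))
        pos += length
    return items

# The two entries shown in the wireshark decode above
menu = encode_boot_menu([(0x8000, "PXELINUX (X86-64_EFI)"),
                         (0x8001, "PXELINUX (EFI)")])
print(len(menu))        # 41, matching "Length: 41" in the decode
print(decode_boot_menu(menu))
```

The encoded bytes begin with the same hex digits as the dump above, which confirms that both menu entries come from the custom pxe-service lines of configuration 2.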
> I think there might be a difference in how the file chosen via the boot
> menu is served with pxe-service. I have noticed there are two ways to
> specify the file to boot in DHCP for IPv4. One is in the fixed header, and
> the first try chosen from the menu uses that. The pxe-service options make
> the client send a direct query to the DHCP server, marked proxyDHCP in
> wireshark. This proxy ACK is followed by TFTP.
> I used this filter in wireshark: "dhcp or (!tftp.destination_file && tftp)"
> However, the following DHCP offers carry the boot file path ONLY in the
> option 67 value. The fixed header boot file is all zeroed. It seems to me
> this is the part the snponly.efi firmware does not understand. It does not
> try to use the path in the option, but may insist only on the file field.
> Since option #52 (overload) is not in the packet, I guess dnsmasq should
> have used mess->file for the path and not option 67. But the rules at
> rfc2131.c:2476 are simple: if the client has requested option 67, it
> should handle it as option 67. I guess it is a bug in snponly.efi. Either
> it should not include option 67 among the requested options or it should
> actually handle the option. Dnsmasq would offer the boot path in both
> cases.
> Interestingly enough, dnsmasq is inconsistent with itself. It behaves a bit
> differently in PXE proxy mode, where the file field of the header is always
> used. In normal mode, unless --dhcp-no-override is used, the option is used
> if requested. Could you please check whether the dhcp-no-override option
> fixes your issues? I think it should behave the same way in both
> situations.
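
The placement rules described in this message can be summarised in a small Python sketch. It is a simplified model of the description given here, not dnsmasq's actual code. The function and parameter names are hypothetical, and the example request list is invented for illustration.

```python
OPTION_BOOTFILE_NAME = 67  # DHCP option 67, "Bootfile name" (RFC 2132)

def boot_file_location(requested_options, no_override=False, proxy_mode=False):
    """Return where the boot file path goes: "file" (fixed header) or "option67".

    Simplified model of the behaviour described in this thread:
    - in PXE proxy mode the fixed-header file field is always used;
    - with --dhcp-no-override the fixed-header file field is used;
    - otherwise option 67 is used, but only if the client requested it.
    """
    if proxy_mode or no_override:
        return "file"
    if OPTION_BOOTFILE_NAME in requested_options:
        return "option67"
    return "file"

# Hypothetical client that lists option 67 in its parameter request list
requested = {1, 3, 6, 66, 67}
print(boot_file_location(requested))                    # option67
print(boot_file_location(requested, no_override=True))  # file
print(boot_file_location(requested, proxy_mode=True))   # file
```

Under this model, a client that requests option 67 but only reads the fixed-header file field sees an empty boot file in normal mode, which is the symptom described for snponly.efi.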
> I attached a patch which sets the boot file for pxe-service the same way as
> for dhcp-boot. It may require dhcp-no-override where it did not before.
> Could you please try it?
> On 9/28/21 11:54, Shrenik Bhura wrote:
> Hi Petr,
> As per your guidance, we have enabled logging (LOG_ALL in
> config/console.h) and recompiled the iPXE binaries. Below are the latest
> observations.
> Repeating the scenarios from the previous post for ease of reference -
> 1. Default dnsmasq config with default ltsp's pxe-service entries -
> https://drive.google.com/file/d/1-BGnZw4RMAuIbJudVA2D4a1vasNeAd1j/view?usp=sharing
> 2. Custom pxe-service entries (just to prove that pxe-service and
> dhcp-boot do seem to successfully co-exist) -
> https://drive.google.com/file/d/1-CjHXxlKmYw-9aOTD7xK8m5uAdj4qyAB/view?usp=sharing
> 3. Without pxe-service entries -
> https://drive.google.com/file/d/1-6Q_1Fg6zVVNruzQTJjxvmKRRkRnCBmh/view?usp=sharing
> I'll try to summarise our understanding and the remaining ambiguities so
> far, to help assign responsibility for the several things that may be going
> wrong here:
> Between scenario (1) and (2), we see that ltsp.ipxe is being served in (2)
> which doesn't happen in (1).
> In (1), the primary issue is that EFI clients do not receive snponly.efi,
> thus they do not advertise option 175 and hence are not sent the ltsp.ipxe.
> Since it has not got to the iPXE stage as yet, there are no logs available
> from ipxe.  All that is visible momentarily on the client side is these two
> lines -
> *Station IP address is *
> *PXE-E21: Remote boot cancelled.*
> Quoting from an explanation herein [1] for "Remote boot cancelled" -
> *" This message is also displayed when a DHCP/proxyDHCP server sends a
> menu that auto-selects Local Boot and when a bootserver sends a bootstrap
> program that returns control to the PXE LoadFile protocol. "*
> In scenario (2), PXE boot menu is displayed as defined in the pxe-service
> lines, option 175 is received back from the client, ltsp.ipxe is sent but
> is not "downloaded" by the client. There is nothing reported in the ipxe
> logs. On the client, the last line says -
> No more network devices.
> But, above all, if we simply comment out all the pxe-service lines, as in
> scenario (3), including the one with tag:rpi, the EFI clients boot up
> perfectly. iPXE log has -
> ipxe: Downloaded "ltsp.ipxe"
> ipxe: Executing "ltsp.ipxe"
> ipxe: Downloaded "vmlinuz"
> ipxe: Downloaded "ltsp.img"
> ipxe: Downloaded "initrd.img"
> ipxe: Executing "vmlinuz"
> The question thus arises: why does dnsmasq not ignore the pxe-service
> lines which have an unmatched "tag:proxy" or "tag:rpi" when dnsmasq is
> operating in non-proxy mode? Or does it ignore them, and the problem lies
> outside dnsmasq? With respect to scenario (1), there could be a problem in
> the UEFI implementation; with respect to (2), there could be an issue with
> iPXE. But what we can immediately control within dnsmasq is to ignore
> pxe-service lines with tags that have not been set.
> Your thoughts?
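
To make the expectation concrete, here is a minimal Python sketch of the tag semantics assumed in the question: a line applies only if every tag:NAME on it is set and every tag:!NAME is not set. The helper is hypothetical and models only the expected matching, not what dnsmasq does internally. The tag lists come from the five default LTSP pxe-service lines quoted later in this thread, and the set of tags for the client is an assumption.

```python
def line_matches(line_tags, set_tags):
    """True if all plain tags are set and all negated ("!name") tags are unset."""
    for tag in line_tags:
        if tag.startswith("!"):
            if tag[1:] in set_tags:
                return False
        elif tag not in set_tags:
            return False
    return True

# Tags on the five default LTSP pxe-service lines
pxe_service_tags = [
    ["proxy", "!iPXE"],  # undionly.kpxe
    ["proxy", "!iPXE"],  # snponly.efi
    ["proxy", "iPXE"],   # ltsp.ipxe (X86PC)
    ["proxy", "iPXE"],   # ltsp.ipxe (X86-64_EFI)
    ["rpi"],             # Raspberry Pi Boot
]

# Assumed tags for a non-proxy EFI client before iPXE is loaded:
# neither "proxy" nor "rpi" is set.
set_tags = {"X86-64_EFI"}
print([line_matches(tags, set_tags) for tags in pxe_service_tags])
# [False, False, False, False, False]
```

With these semantics none of the five lines applies to the client, which is why one would expect them to have no effect on a non-proxy EFI boot.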
> [1]
> https://techpubs.jurassic.nl/manuals/hdwr/enduser/SG750_UG/sgi_html/ch04.html
> On Mon, 27 Sept 2021 at 22:56, Petr Menšík <pemensik at redhat.com> wrote:
>> Hello,
>> I made a mistake when reading the code. You are right. The part I
>> mentioned only affects the vendor-class information in option 43, and only
>> in DHCPREQUEST or DHCPINFORM, which is not in the request in the pcap you
>> sent.
>> It seems to me the problem is somewhere on the iPXE side, in decoding the
>> reply dnsmasq sent to it. I took a look at the second offer in both
>> without-pxe and default-ltsp. It seems the only difference is in the
>> vendor-class information containing the PXE menu. Without-pxe continues to
>> TFTP, whereas default-ltsp gets stuck. The answer is on the decoding side.
>> The same boot file was assigned successfully in both configurations. I am
>> afraid the problem is in the PXE client's decoding, which may not
>> understand the menu dnsmasq tried to send.
>> According to the option 43 decoding in wireshark, the PXE suboptions look
>> fine, except suboption 9, the boot menu. Type unknown 0x8000 does seem
>> weird, but should be just "Vendor use" according to the IBM docs [1]. Why
>> it did not do anything else should be answered by the iPXE people. It
>> should continue after 2 seconds even without any action. Did it at least
>> display the boot menu on that station? Did it show anything? Are those
>> machines with normal VGA output? Perhaps LOG_LEVEL in iPXE [2] might reveal
>> the true reason.
>> Cheers,
>> Petr
>> 1.
>> https://www.ibm.com/docs/en/aix/7.2?topic=daemon-pxe-vendor-container-suboptions
>> 2. https://ipxe.org/buildcfg/log_level
>> On 9/27/21 16:04, Shrenik Bhura wrote:
>> Hello Petr,
>> Thanks for your guidance.
>> It does seem that dhcp-boot is being reached even when pxe-service is
>> successfully executed. Taking a hint from this discussion on UEFI and PXE (
>> https://bbs.archlinux.org/viewtopic.php?id=237655), we tried this custom
>> configuration -
>> pxe-prompt="Press any key for boot menu",2
>> pxe-service=X86-64_EFI,"PXELINUX (X86-64_EFI)",ltsp/snponly.efi
>> pxe-service=7,"PXELINUX (EFI)",ltsp/snponly.efi
>> dhcp-boot=tag:!iPXE,tag:X86PC,ltsp/undionly.kpxe
>> dhcp-boot=tag:!iPXE,tag:X86-64_EFI,ltsp/snponly.efi
>> dhcp-boot=tag:iPXE,ltsp/ltsp.ipxe
>> (full file attached below)
>> The server does proceed to offer ltsp.ipxe to the client via DHCP, but the
>> file is ultimately not transferred via TFTP.
>> I have attached logs, pcap and dnsmasq configuration of three scenarios -
>> 1. Default dnsmasq config with default ltsp's pxe-service entries
>> 2. Custom pxe-service entries
>> 3. Without pxe-service entries
>> We have tested these with two systems - Intel NUC and Dell Optiplex 3040
>> with their updated firmware and have found the same results.
>> I hope this helps to zoom further into the problem area.
>> Best regards,
>> Shrenik
>> On Mon, 27 Sept 2021 at 17:00, Petr Menšík <pemensik at redhat.com> wrote:
>>> Hi Alkis,
>>> It would be helpful if you could record a pcap with those lines commented
>>> out and with them enabled. It seems suspicious that the dhcp-boot option
>>> is present at the same time as pxe-service. From what I understood,
>>> pxe-service should offer boot options only to clients with the PXEClient
>>> vendor string. I think it saves you the need for
>>> dhcp-match=set:X86PC,option:client-arch,0
>>> which is then matched in
>>> dhcp-boot=tag:!iPXE,tag:X86PC,ltsp/undionly.kpxe
>>> dhcp-boot=tag:iPXE,tag:X86PC,ltsp/ltsp.ipxe
>>> I just checked my Raspberry Pi 3. It seems the architecture the RPi
>>> reports in its DHCP request is clearly wrong. Unfortunately, it also
>>> reports it wrongly in the vendor class, ARCH:0000.
>>> Anyway, dnsmasq might not handle tags correctly. Around
>>> src/rfc2131.c:891, it searches for a pxe-service without using tags. That
>>> search is not used to find the correct service, just to find the correct
>>> context.
>>> Also, it seems that if any pxe-service is defined and the incoming DHCP
>>> packet contains PXEClient in the VendorClass option, it MUST be handled by
>>> pxe-service. If no correct service and context is found, the packet is
>>> not handled. It cannot fall back to the normal DHCP reply in that case,
>>> which can be fixed. But the current situation seems clear to me: if any
>>> pxe-service is present, all PXEClient packets have to be handled by it.
>>> It seems to me you define tags per arch anyway, so I guess you can avoid
>>> pxe-service just fine.
>>> I made an attempt to respond to a PXE request only when the correct
>>> service matches. But I have no setup prepared for it; I only tested that
>>> it compiles. Could you check whether it helps?
>>> Cheers,
>>> Petr
>>> On 3/19/21 10:05, Alkis Georgopoulos wrote:
>>> > Hi all,
>>> >
>>> > I'm one of the LTSP developers; I asked Shrenik to contact the dnsmasq
>>> > mailing list because I feel this might be a dnsmasq issue.
>>> >
>>> > Specifically, success or failure depends on whether these five lines
>>> > are commented out or not:
>>> >
>>> > #pxe-service=tag:proxy,tag:!iPXE,X86PC,"undionly.kpxe",ltsp/undionly.kpxe
>>> > #pxe-service=tag:proxy,tag:!iPXE,X86-64_EFI,"snponly.efi",ltsp/snponly.efi
>>> > #pxe-service=tag:proxy,tag:iPXE,X86PC,"ltsp.ipxe",ltsp/ltsp.ipxe
>>> > #pxe-service=tag:proxy,tag:iPXE,X86-64_EFI,"ltsp.ipxe",ltsp/ltsp.ipxe
>>> > #pxe-service=tag:rpi,X86PC,"Raspberry Pi Boot   ",unused
>>> >
>>> > You may find the full configuration files and logs at:
>>> > https://github.com/ltsp/ltsp/pull/417
>>> >
>>> > The reason I feel it might be a dnsmasq issue, is that these tags are
>>> > NOT matched in Shrenik's use case. He's not using proxy mode and he's
>>> > not booting a Raspberry Pi.
>>> >
>>> > So, "pxe-service" lines that are NOT matched, cause the problem,
>>> > yet if they're commented out, the problem is gone...
>>> >
>>> > Would that be an issue with dnsmasq, or with the UEFI PXE stack?
>>> >
>>> > Thanks,
>>> > Alkis Georgopoulos
>>> >
>>> > _______________________________________________
>>> > Dnsmasq-discuss mailing list
>>> > Dnsmasq-discuss at lists.thekelleys.org.uk
>>> > https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss
>>> >
>>> --
>>> Petr Menšík
>>> Software Engineer
>>> Red Hat, http://www.redhat.com/
>>> email: pemensik at redhat.com
>>> PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB