Technology will make your life easier

RogerBW · 24 May 2022 18:10

Many years ago I went on a PRINCE2 course, in which they admitted that doing the whole rigmarole would have about a 10-20% overhead, but that this was probably better than you’d get out of a project that went wrong.

pillbox · 24 May 2022 23:44

I bought a ThinkPad T470, used, from a buy-sell-trade post in our neighborhood’s bulletin board. It is my… 6th(?) ThinkPad and definitely very nice. But I very much prefer things my T420s has that this does not.

Namely, instead of a keyboard backlight, my T420s has a built-in light at the top of the screen bezel, which you can use to illuminate the keyboard or, say, anything you place on the keyboard. Whereas the T470 has a dual-brightness traditional backlight, where you can see more light around the keys than through the keycaps; effectively worthless.

Additionally, the T420s had page-forward and page-backward keys near the direction keys; using these became like second nature for navigating the web; in the same place on the T470 are the PgUp and PgDn buttons… which has led to some confusion and disorientation.

I’m currently running the factory-reset-available Windows 10 that came with the laptop, but will be looking into linux compatibility for the hardware soonish.

dscheidt · 25 May 2022 02:27

The ThinkLight. My T42 was called “lightbulb” because of it. I also had scripts to flash it in different patterns to alert me to different events. I’m not cool enough to have ever learned Morse, so it was just basic stuff.

RogerBW · 25 May 2022 08:24

The T430 on which I’m typing this came with ThinkLight as standard, and I got the keyboard backlight as well (which I prefer when I’m working in the dark). Newer beast is a weird sort of hybrid “P17” near the end of its production run; nVidia graphics alas, but otherwise very nice, keyboard backlight only. And for travel I have an Edge E145 that I SSD-ed because nobody makes small keyboarded laptops like that any more and it’s fast enough for everything except the modern web, but it doesn’t have either. (All running Debian/stable.)

dscheidt · 27 May 2022 15:10

A network failure mode I have not seen before, very condensed version.

I have some machines somewhere. They have dual ethernet links, one connected to switch A, one to switch B. They’re both active, and traffic is split between them based on stuff. If one link goes down, everything goes to the other. The switches are VXLAN tunnel endpoints, but my machines use old-fashioned VLANs. VXLAN traffic through one of the switches wasn’t arriving. Local VLAN traffic to the same destinations worked. Lots of troubleshooting, resetting, blah, blah. Switch hardware vendor got involved. Root cause: every 60th bit in VXLAN packets was being inverted. Hardware replaced, everything works.

pillbox · 27 May 2022 15:17

That’s a deeper RCA than I’ve ever seen before on network gear

tomm_archer · 27 May 2022 16:01

In what I can only assume is a cost-cutting measure, IT went through a process a while back of uninstalling Microsoft Visio from computers where it hadn’t been used in over 6 months. If you begged, you used to be able to get it installed/re-installed.

Now their suggested tool is Inkscape. I like Inkscape, but a replacement for Visio it is not. New members of my team will now be unable to edit Visio diagrams.

pillbox · 27 May 2022 16:14

I’ve worked for a company where my requests to get Visio installed miraculously kept getting mis-routed or ‘accidentally’ closed. So whenever I was asked for the required Visio design for a project, I just shrugged my shoulders, said “I don’t have Visio installed” and CC’d my supervisor, and then my supervisor would field the inevitable “have someone else do it” with the “each team member has their assigned workload based on forecasting and expected LOE, we cannot reassign tasks”.

Be sure to create approved templates and stencils only in Visio

dscheidt · 27 May 2022 17:42

It’s a big honking switch, with as many as 72 100G ports. (They buy the switch in multiple configurations, I don’t know which these are. Doesn’t matter to my stuff. Just give me my ports.) Vendors get a little touchy when you tell them it needs to be replaced, and want some assurance it’s really their fault, and not something else. The last one of these I got pulled into was a software bug that caused similar problems.

COMaestro · 28 May 2022 00:34

Microsoft Office programs have just gotten more complex with their compatibility issues, I think in an effort to get people to move to O365.

Used to be, you’d buy a copy of Office (or Visio or Project) and install it and be done. Then they made the split between Click-to-Run and KMS-based installs, and they could not be used together (if you had KMS Office, you could not install a Click-to-Run Visio or Project). As O365 became more prevalent the incompatibilities only increased.

yashima · 31 May 2022 14:51

Can I abuse you lovely techies here as a Stackoverflow replacement?

In any case. The scenario includes REST client-server communication with timeouts. How would you make a server aware that the client has decided the server was taking too long and has generated a timeout on its side? Is it possible to do without callbacks from client to server?

This is a more lengthy explanation of the issue:

A user on Server A sends some data for an “order” to Server B via rest.
Server B begins processing “order” but there is a hiccup and it takes a long time.
Server A decides there must have been a communication error and runs into the configured timeout. Server A calls Server B to request a roll back of “order” just in case and then Server A forgets about “order”.
Meanwhile, Server B is still processing “order” when the roll back request arrives; the roll back does nothing because “order” does not exist yet due to whatever processing hiccup is going on. So Server B keeps processing “order”; at some point processing is done and “order” is stored on Server B, without Server B being able to realize “order” is now orphaned.
Back on Server A the “order” does not exist, so user decides to generate a new one for the same content and this one goes through and produces a duplicate.

This is basically transactions over HTTP/REST communication. An alternative solution would have us call remote EJBs instead but this would mean implementing a completely new thing.

Just by chance, can anyone tell me what the preferred rest solution to this is? I’ll be googling a lot in the next couple days. The project’s architect suggested this ticket was a “desperation generation engine” and seemed relieved someone else grabbed that one from the sprint and that someone is me

PS: all the methods are void, changing them means changing a versioned interface … which I am told is to be avoided. It is not a great implementation but it is what we have now.

Phil · 1 June 2022 00:17

Is it possible to do without callbacks from client to server?

I think the two options are:

client tells server that it’s given up (the callback option)
client and server agree in advance on a maximum duration before client gives up

The second option would need to be tied to a timestamp that is the same for both hosts (offhand, I’d think the server would generate that timestamp when responding to the client’s original request, and the client would receive that response, and both sides would be counting from that timestamp).

Your problem is partly a serialisation issue, yes? Avoiding having a later “cancellation” request processed prior to the original “order” request? The server might queue the incoming requests and process them one at a time, such that requests from a given client are guaranteed to be processed in the same order they arrived in.

That still doesn’t allow for regular networking mishaps where things arrive in a different order to how they were sent though, so you might consider something similar to what TCP does (which is include explicit sequencing numbers so that the sequence can be reconstructed in order at the other end – so your ‘queue’ might actually allow for late arrivals of items which need to “jump the queue”). Perhaps that’s a non-issue with a REST API though… everything has a request and a response, and the client presumably won’t be issuing any “cancellation” requests without having already received a response for the preceding “order” request, so this paragraph might be an entirely unnecessary level of complication.
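
For what it’s worth, the sequence-number idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name `ReorderBuffer`, the numbering from 0, and the request strings are all invented here, and it handles a single client only.

```python
class ReorderBuffer:
    """Releases requests in sequence-number order, holding back early arrivals."""

    def __init__(self):
        self.next_seq = 0   # next sequence number we are allowed to process
        self.pending = {}   # early arrivals, keyed by sequence number

    def receive(self, seq, request):
        """Accept one request; return the requests that are now ready, in order."""
        if seq < self.next_seq or seq in self.pending:
            return []  # duplicate delivery, ignore it
        self.pending[seq] = request
        ready = []
        # Drain everything that has become contiguous with what was processed.
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready


buf = ReorderBuffer()
out_of_order = buf.receive(1, "cancel order 42")   # arrives first, held back
in_order = buf.receive(0, "create order 42")       # releases both, in order
```

A real server would keep one such buffer per client and would need some policy for a sequence number that never arrives.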

Ok, that’s my offhand hand-waving (off-hand-waving?)

yashima · 1 June 2022 07:39

Thank you so much. That is actually giving me more than one idea
It helps so much breaking down a problem and trying to present it to someone who doesn’t know the specifics.

Phil · 1 June 2022 07:56

Re-reading that, I’ve realised that it presents the fun scenario of “client and server clocks are dramatically out of sync” (which is something I’ve never had to deal with rigorously; I’ve never needed to care), but AFAIK clock synchronisation is necessarily going to be somewhat fuzzy and non-trivial if you’re trying to estimate latency with a good degree of accuracy. You may or may not care a lot about it, but you probably want to make some minimal efforts (e.g. if the client receives a message from the server with a “when the server sent this” timestamp in the future – even though some time has elapsed since then due to network latency – then you probably don’t want to trust that the clocks are in sync…)

yashima · 1 June 2022 08:01

Yes, the timestamp has the problem that, while we assume NTP works perfectly and is used everywhere, this is not a given.

So while one might use a timestamp for some heuristic bugfixing in this case, my other thought goes to: why not write the rollback somewhere it can be found and processed later? Then whenever an order is “stored”, the rollbacks are checked afterwards… no fancy async code needed and then rollback can still happen as it should. I have no idea why it isn’t done that way. One would need to make sure the rollback cache/queue is checked regularly for even more orphans… but I could see this done on the server without having to touch any interfaces or change technologies.
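
A minimal sketch of that “remember the rollback” idea, in Python for brevity rather than in the project’s own code. Everything here is an assumption for illustration: the `OrderStore` class, its method names, and above all the premise that the rollback request carries an order ID that Server B can match against the order once it is finally stored.

```python
class OrderStore:
    """Server B's side: orders plus a record of rollbacks that arrived too early."""

    def __init__(self):
        self.orders = {}                 # order_id -> order data
        self.pending_rollbacks = set()   # rollbacks for orders not stored yet

    def rollback(self, order_id):
        """Delete the order if it exists; otherwise remember the rollback."""
        if order_id in self.orders:
            del self.orders[order_id]
        else:
            self.pending_rollbacks.add(order_id)

    def store(self, order_id, data):
        """Store a finished order unless a rollback is already waiting for it.

        Returns True if the order was kept, False if it was discarded.
        """
        if order_id in self.pending_rollbacks:
            self.pending_rollbacks.discard(order_id)
            return False
        self.orders[order_id] = data
        return True


server_b = OrderStore()
server_b.rollback("order-1")                      # arrives while processing is stuck
kept = server_b.store("order-1", {"item": "x"})   # processing finally completes
```

The sketch ignores concurrency: in a real server the check and the store would have to happen atomically (a lock or a database transaction), and remembered rollbacks that never get matched would need to expire at some point.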

edit 2: I realize this is still only an approximation of transactional safety, because obviously the Rollback Request can also fail to make it through. But at this point it seems a 90% solution is acceptable

I also want to investigate “connection reset by peer” that I saw in the log yesterday when I killed my client in the middle of a request.

edit: In the long run, it might be necessary to switch away from “void” methods. I don’t like these as part of webservices. They are mostly unhelpful in gauging if stuff worked successfully.

tomm_archer · 1 June 2022 16:19

Testing has started for our next software release so of course the issues with the build machine have returned.

I’m currently on the fifth attempt of a single pipeline. I had two runs fail (early on) because of the virus checker. I had another run fail (an hour in) because the build machine got rebooted and another run fail (four hours in) because the connection to the network went down for twenty minutes.

Glad I wasn’t in the office today, the air would have been very blue.

Edit: And the 5th failed a couple of hours in for the same reason as the fourth. Looks like I’m logging in on my day off tomorrow.

yashima · 2 June 2022 07:51

I just want to say this: I hate whitespace.

Context: for local testing we run our code in docker containers. Since we have a variety of scenarios involving multiple instances (see my server A / B problem above–has not yet been worked on due to the fuckup I’ll be describing here) of our code communicating with each other, we have different setups of docker containers…

I inherited the yml files in question from the colleague who set up the docker stuff for the project. The yml files were a bit messy and a while ago I decided to clean them up. I had everything up and running at last. I came back from vacation and it still worked.

Then suddenly in the middle of Tuesday afternoon right after a rebuild of all project files… one of the more complex scenarios stopped working. I reverted my changes and built again. No change. Usually this means my local routing is fucked up because Windows sucks that way. Reboot WSL. Nothing. Reboot router. Nothing. Reboot Windows, twice. Nothing. Rebuild again. Check that all files are there, fresh and have the correct rw rights. All fine. But still not working. Ah whatever…

I don’t need the scenario at this point. So I keep working. Wednesday afternoon, my changes for the new build on Thursday are finished. I just need to check the complex “cross” scenario. OMG, I remember it wasn’t working. Ah, well, that was probably a fluke. Repeat previous paragraph. Not working.

Desperation.

By that time I was largely alone in the “office” (meaning everyone but me and the one late-working colleague were offline in teams)… and I had a change to push that needs to go into the build today.

More desperation. I needed that docker scenario to execute the tests that would make sure my changes were fine.

… Colleague said to push my changes anyway since the nightly build would find all the problems. So I did and thanked him for having my back.

Come to work this morning, all the expected tests failed (that I should have been able to execute and fix before the nightly).

Call “docker specialist”… we look at my configs: they look fine to both of us and yet that one node is not working. (Not working meaning the logfile looks good, the node is all booted up, but the routing from Windows to the node is still foo-bared.)

I had figured out that the same node was working in a slightly different setup. And we diffed those 2 configs.

Beyond Compare marked one line as “different” that looked absolutely identical to the human eye. There must have been a whitespace character there that was not the same.

I look at my colleague: I copied this from the tried and true config I have stashed in the IDE, this cannot be it. But I copied the line from the working config anyway and tried.

And it works.

I hate whitespace.

PS: the mistake here being obviously that the yml files are not under version control
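
For anyone hitting the same thing: a line that only looks identical can be made to show itself with a few lines of Python. This is a rough sketch, not what was used here; the function name `show_invisible` and the two sample lines are made up, and the no-break space is just one possible culprit.

```python
def show_invisible(line):
    """Return the line with whitespace and non-ASCII characters made visible."""
    parts = []
    for ch in line:
        if ch == " ":
            parts.append("·")                    # ordinary space
        elif ch == "\t":
            parts.append("\\t")                  # tab
        elif ch.isascii() and ch.isprintable():
            parts.append(ch)                     # normal visible character
        else:
            parts.append(f"<U+{ord(ch):04X}>")   # anything else, by code point
    return "".join(parts)


good = "  ports:"
bad = "\u00a0 ports:"   # first character is a no-break space, not a space

print(show_invisible(good))
print(show_invisible(bad))
```

Run over the two config files line by line, the offending line stands out immediately instead of needing a second pair of eyes.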

RogerBW · 2 June 2022 09:49

You won’t be wanting the Whitespace programming language, then, in which the only significant characters are spaces, tabs and linefeeds.

(I like YAML a lot, at least the sensible parts of it, but space/tab can definitely cause problems…)

yashima · 2 June 2022 09:56

Definitely.

There are some other factors that contributed

my wireless keyboard sometimes disconnects when I do not type for a while and I am used to things not immediately happening when I start typing somewhere and… sometimes I start typing in the wrong window and do not notice…
I use SublimeText which–outside of my killing it–keeps unsaved changes over multiple sessions but I have not configured it to autosave.

So that space/tab may have been there for a while unnoticed until I made another change in the file that I intended to do and saved manually.

I dislike languages that use indentation depth for blocks instead of braces. Yes, yes, braces are more verbose and all but they are easier for humans to figure out and generally better supported in editors.

RogerBW · 2 June 2022 10:06

I’ve been doing light weekly programming challenges in a variety of languages to help me get more fluent. The thing that particularly grates for me in significant-whitespace is ending multiple blocks at once: where a C-descended bracey language might say

if (condition) {
  for (loop) {
    magic();
  }
}
more_magic();

Python wants

if (condition):
  for (loop):
    magic()
more_magic()

and particularly if I’m already a bit deeper than top level it’s not visually obvious just how many blocks I’m dropping out of.