Forno WSS outage since last week 4/3

Hi guys,

Not sure if cLabs is aware, but Forno websocket (WSS) has been down since last Sunday, 4/3.
I switched all of my systems over to private nodes for websocket connections, but it would be nice to know what’s going on with Forno WSS and whether there are plans to bring it back.

Thanks!

We did notice an unusual spike last week, and WSS was disabled. We will be re-enabling it shortly. Thanks for reporting it.

Cloudflare has pretty limited ability to throttle or block excessive websocket connections once the initial HTTP handshake is done.

One way to control websocket abuse is to cut the socket lifetime to something much shorter, like 5 minutes, and force the client to reconnect. If a client has to reconnect every 5 minutes, then it’s easier to write a Cloudflare WAF rule that rate limits how many “/ws” requests a single IP makes per hour.

Then you can set a rate limit on /ws of 100 requests per hour per IP and prevent a single IP from holding onto a large number of websocket connections.
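To make the idea concrete, here is a rough Python sketch of the per-IP handshake limit described above. It is not how Cloudflare or Forno implement anything; the class and function names are made up for illustration, and the 100 per hour and 5 minute numbers are just the ones suggested in this post.

```python
import time
from collections import defaultdict, deque


class HandshakeRateLimiter:
    """Sliding-window limit on websocket handshakes ("/ws" requests) per IP.

    Hypothetical helper for illustration only.
    """

    def __init__(self, max_requests=100, window_seconds=3600, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        # ip -> timestamps of handshakes accepted inside the current window
        self._hits = defaultdict(deque)

    def allow(self, ip):
        """Return True if this IP may open another websocket right now."""
        now = self.clock()
        hits = self._hits[ip]
        # Forget handshakes that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


def max_sustained_connections(limit_per_hour, lifetime_seconds):
    """How many always-on sockets one IP can keep alive under the limit.

    Each socket must reconnect 3600 / lifetime_seconds times per hour,
    so the hourly handshake budget is shared between them.
    """
    reconnects_per_hour = 3600 // lifetime_seconds
    return limit_per_hour // reconnects_per_hour
```

With a 5 minute lifetime each socket costs 12 handshakes per hour, so `max_sustained_connections(100, 300)` works out to 8 long-lived connections per IP.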

I noticed that Forno WSS is back online now, thanks!

What kind of additional websocket protections were put in place to prevent future abuse?

Hi @diwu1989! Thanks for the posts. As you observed, websocket service was restored last week. We had disabled it to guarantee service for Celo Connect.

We are exploring better options to ensure Forno service on both endpoints (RPC and websockets) with fair usage across all users. For now we can better control QoS on the RPC endpoints, so my recommendation is to use https instead of wss as the default endpoint. Your tip makes a lot of sense and was one of the options we had considered. We have also thought about adding some metrics that can help us identify the clients flooding websocket connections.
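As a rough illustration of the kind of metric mentioned above, the Python sketch below counts open websocket connections per client IP and reports the heaviest users. It is only an assumption about what such a metric could look like, not Forno's actual instrumentation, and `ConnectionTracker` and its methods are hypothetical names.

```python
from collections import Counter


class ConnectionTracker:
    """Tracks how many websocket connections each client IP holds open."""

    def __init__(self):
        self._open = Counter()  # ip -> number of currently open sockets

    def opened(self, ip):
        """Record a new websocket connection from this IP."""
        self._open[ip] += 1

    def closed(self, ip):
        """Record that one connection from this IP was closed."""
        if self._open[ip] > 1:
            self._open[ip] -= 1
        else:
            # Last connection gone (or unknown IP): drop the entry entirely.
            self._open.pop(ip, None)

    def top_clients(self, n=5):
        """Return the n IPs holding the most open connections."""
        return self._open.most_common(n)

    def over_threshold(self, threshold):
        """Return IPs holding more than `threshold` open connections."""
        return sorted(ip for ip, count in self._open.items() if count > threshold)
```

Calling `opened` and `closed` from the connection handlers and exporting `top_clients()` periodically would be enough to spot a single IP flooding the endpoint.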

What are Forno’s rpc.gascap and rpc.evmtimeout values set to?
The out-of-the-box defaults are too high and make it easy to abuse node resources.

The gascap default is 50m gas, which is 2x a full Celo block. Something much lower, like 10m, is a sane limit that still allows for real user calls.

evmtimeout is also set to 5s by default, which is a lot of CPU time to allocate to a single request. Something much lower, like 1s, will prevent someone from saturating the CPU with nonsense calls.
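For reference, a small Python sketch of what the suggested values could look like as node flags. The flag spelling (`--rpc.gascap`, `--rpc.evmtimeout`) follows upstream go-ethereum, and I am assuming celo-blockchain accepts the same form; the helper names are hypothetical. The second function is a back-of-the-envelope estimate that assumes every abusive call runs until the timeout and pins one core.

```python
def rpc_limit_flags(gas_cap=10_000_000, evm_timeout_seconds=1):
    """Build geth-style flags for the suggested limits (10m gas, 1s timeout).

    Assumes go-ethereum flag names; verify against the node version in use.
    """
    if gas_cap <= 0 or evm_timeout_seconds <= 0:
        raise ValueError("limits must be positive")
    return [
        f"--rpc.gascap={gas_cap}",
        f"--rpc.evmtimeout={evm_timeout_seconds}s",
    ]


def requests_per_second_to_saturate(cores, evm_timeout_seconds):
    """Request rate needed to keep every core busy with calls that hit the timeout."""
    return cores / evm_timeout_seconds
```

Under those assumptions, on a 10-core node the 5s default lets about 2 expensive calls per second keep every core busy, while a 1s timeout raises that to 10 calls per second.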

Good call. Forno is using the default celo-blockchain values for those parameters, so, as you suggest, we can probably tune them to reject very demanding requests.
Forno config is open source and is based on this helm chart. If you are curious specifically about the geth command-line options, you can find most of them in this file. I will check your suggestions and prepare a PR with the suggested changes. Thanks @diwu1989 :pray: